ABSTRACT

This work begins with a study of individual decision-making
under uncertainty, a problem which we formulate as

(1)    $\underset{x}{\text{Maximize}} \; f(x,\beta) \quad \text{subject to} \quad g_i(x,\beta) \geq 0, \; i = 1, \ldots, m,$

where x is a decision n-vector, $\beta$ is a b-vector of exogenous
variables and parameters of the decision model, f is an objective
function to be maximized, and the $g_i$ are constraint functions
which determine the set of feasible decisions. The source of
uncertainty is $\beta$, which is known only to lie in a given set B. We
also consider the case in which a probability distribution over B
is given.

Several methods for circumventing uncertainty in the constraints
are briefly reviewed, and several decision criteria for circumventing
uncertainty in the objective function are discussed. Particular
attention is devoted to the demonstration of certain relationships
between these criteria. It is concluded that vector maximum
reformulations of (1) play a prominent role in dealing with
uncertainty in such decision problems.

A vector maximum problem is of the form

(2)    $\text{``Maximize''} \; f_1(x), \ldots, f_r(x) \quad \text{subject to} \quad g_i(x) \geq 0, \; i = 1, \ldots, m.$

The quotation marks signify that it is desired to find all efficient
decisions, i.e., all decision vectors satisfying the constraints
such that it is impossible to achieve an increase in any one objective
function without violating the constraints or decreasing at least
one of the other objective functions. In Chapter II we discuss two
methods for transforming a vector maximum problem into an equivalent
parametric programming problem. Existing computational methods for
the latter problems are briefly surveyed.

The principal contribution of this work is presented in Chapter III
as a class of algorithms for solving parametric concave programming
problems of the form

(3)    $\text{Maximize} \; \alpha f_1(x) + (1-\alpha) f_2(x) \quad \text{subject to} \quad g_i(x) \geq 0, \; i = 1, \ldots, m,$

for each fixed value of $\alpha$ in the closed interval [0,1], where
$f_i$ (i = 1, 2) are strictly concave functions, $g_i$ (i = 1, ..., m)
are concave functions, and certain additional regularity assumptions
are made. Under these assumptions it is shown that (2) (with r = 2)
and (3) are equivalent in the sense that $x^\circ$ is efficient in (2)
if and only if $x^\circ$ solves (3) for some value of $\alpha$ in the unit
interval. The present class of algorithms is not "simplex-like"
or "gradient" in nature, but proceeds by maintaining a solution of
the Kuhn-Tucker conditions as $\alpha$ varies by small increments (under
our assumptions these conditions are necessary and sufficient for
an optimal solution of (3)). The main algorithm given herein displays
quadratic convergence at each increment of $\alpha$. A simple modification
for handling linear equality constraints is indicated.
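The equivalence between (2) and (3) can be illustrated with a minimal sketch. It is not the Kuhn-Tucker-following algorithm of Chapter III; it simply solves (3) afresh at each value of $\alpha$ by ternary search. The two objectives, the feasible interval, and the helper names `argmax_concave` and `efficient_path` are hypothetical choices made only for illustration.

```python
# Sketch only: trace efficient decisions of a two-objective problem by
# sweeping alpha in problem (3). All functions below are hypothetical.

def f1(x):
    return -(x - 0.0) ** 2      # strictly concave, maximized at x = 0

def f2(x):
    return -(x - 1.0) ** 2      # strictly concave, maximized at x = 1

def argmax_concave(h, lo, hi, iters=200):
    """Ternary search for the maximizer of a strictly concave h on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if h(a) < h(b):
            lo = a              # the maximizer lies to the right of a
        else:
            hi = b              # the maximizer lies to the left of b
    return 0.5 * (lo + hi)

def efficient_path(alphas, lo=-2.0, hi=3.0):
    """Solve (3) for each alpha; the interval [lo, hi] plays the role of
    the constraints g_1(x) = x - lo >= 0 and g_2(x) = hi - x >= 0."""
    path = []
    for alpha in alphas:
        h = lambda x, a=alpha: a * f1(x) + (1.0 - a) * f2(x)
        path.append((alpha, argmax_concave(h, lo, hi)))
    return path

path = efficient_path([i / 10.0 for i in range(11)])
```

For these particular objectives the maximizer of (3) is x = 1 - $\alpha$, so the sweep traces the interval between the two individual maximizers, which is the efficient set of this toy problem.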


Problem (3) also subsumes the standard (non-parametric) concave
programming problem when a feasible solution is known. Thus the
present algorithms provide a deformation method of concave programming.
Since many of the results of this chapter hold for much more general
parametric problems than (3), moreover, the present algorithms are
pertinent to sensitivity analysis applications.

The final chapter presents a numerical example which illustrates
the solution of a decision problem under uncertainty by means of the
techniques discussed in the preceding chapters.
Notation

x  is a decision vector in $E^n$ (n-dimensional Euclidean
   space), and is under the control of the decision-maker

$\beta$  is an uncertain vector in $E^b$ representing exogenous
   variables and model parameters, and is not under the control
   of the decision-maker

$f(x,\beta)$  is a real-valued criterion function which is to be maximized;
   if there is no dependence on $\beta$, we write f(x); if there
   are several criterion functions, we write f(x) for
   $(f_1(x), \ldots, f_r(x))$

$g(x,\beta) \equiv (g_1(x,\beta), \ldots, g_m(x,\beta))$  is a real vector-valued constraint
   function; if there is no dependence on $\beta$, we write g(x)

$\{z \in Z : z \text{ has property } P\}$  denotes the set of all elements z
   in the set Z which have property P; when Z is omitted,
   it is implicitly understood to be the pertinent universal
   set

X  is a subset of $E^n$ consisting of the feasible decisions; often
   X represents $\{x : g(x) \geqq 0\}$

B  (in Chapter I) is a subset of $E^b$ which is known to contain the
   "true realization" of $\beta$

$x \geqq (>) \; 0$  signifies $x_i \geq (>) \; 0$ (i = 1, ..., n)

$x \geq 0$  signifies $x \geqq 0$ but $x \neq 0$

$\mu$  denotes a probability distribution over B

$C \subseteq (\subset) \; D$  signifies that the set C is a (proper) subset of D

$N_r(x^\circ) \equiv \{x : \sum_{i=1}^{n} (x_i - x_i^\circ)^2 < r^2\}$,  an open neighborhood of
   $x^\circ$ of radius r

$F(\alpha)$  denotes the maximum $\alpha$-fractile criterion (see problem (4.5)
   of Chapter I)

A(M)  denotes the aspiration criterion with aspiration level M
   (see problem (4.6) of Chapter I)

$[a,b) \equiv \{t \in E^1 : a \leq t < b\}$

$(P\alpha)$  denotes the parametric programming problem considered in
   Chapter III; the parameter $\alpha$ may vary in this notation
   (there is no relation between this usage of $\alpha$ and that
   of Chapter I)

$f(x;\alpha) \equiv \alpha f_1(x) + (1-\alpha) f_2(x)$

$\nabla f(x) \equiv \left( \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right)$,  the gradient of f(x)

S  denotes a subset of constraint indices; $S \subseteq M$, where M is
   the set of the first m positive integers

$u \equiv (u_1, \ldots, u_m)$  denotes the dual variables associated with the
   Kuhn-Tucker conditions


(KT-1), ..., (KT-4)  are, collectively, one version of the Kuhn-
   Tucker conditions associated with $(P\alpha)$

$(=S)\alpha$  is a more complete notation for the equations (KT-1) and
   (KT-2); S and $\alpha$ may vary in this notation

$(x^*(\alpha), u^*(\alpha))$  is the optimal solution and dual variables of $(P\alpha)$
   as functions of $\alpha$

$(x^S(\alpha), u^S(\alpha))$  is a solution of $(=S)\alpha$ as a function of $\alpha$

$\nabla^2 f(x)$  denotes the matrix of second partial derivatives (i.e., the
   Hessian) of f(x)

$\langle x^\nu \rangle \to x^\circ$  means that the (infinite) sequence $x^1, x^2, \ldots, x^\nu, \ldots$
   converges to $x^\circ$

C - D  denotes the points in the set C which are not in the set D

$A_\alpha \equiv \{i \in M : u_i^*(\alpha) > 0\}$,  the set of active constraints at $\alpha$; $\alpha$
   may vary in this notation

$B_\alpha \equiv \{i \in M : g_i(x^*(\alpha)) = 0\}$,  the set of binding constraints at
   $x^*(\alpha)$; $\alpha$ may vary in this notation

$\alpha_j$ (j = 1, ..., p)  are the points of change of $A_\alpha$ or of $B_\alpha$ in the
   unit interval; $\alpha'$ is a generic term for a point of change

$\alpha''$  is an arbitrary point strictly between two points of change

$[\alpha', \alpha' + \varepsilon]$,  where $\varepsilon$ is defined immediately above Theorem 4.2,
   Chapter III
CHAPTER I


On the Relevance of the Vector Maximum
Problem to Decision-Making Under Uncertainty

1. Introduction

This chapter addresses a problem of individual decision-making
under uncertainty of the form

(1)    $\underset{x}{\text{Maximize}} \; f(x,\beta) \quad \text{subject to} \quad g(x,\beta) \geqq 0,$

where $x = (x_1, \ldots, x_n)$ is the decision vector, $\beta = (\beta_1, \ldots, \beta_b)$ is
a vector of exogenous variables and parameters of the model, f is
the objective (or criterion or payoff) function to be maximized,
and $g = (g_1, \ldots, g_m)$ is a vector-valued constraint function which
determines the set of feasible decisions. We assume that the functions
f and g are known, but that $\beta$ is known only to lie in a given
set $B \subseteq E^b$, where $E^b$ is b-dimensional Euclidean space. Often
we shall make the additional assumption that $\beta$ may be regarded as
a random variable with a known probability distribution over B.

A choice of x must be made before $\beta$ is found out, if, indeed,
it ever is revealed to the decision-maker. Throughout this chapter,
no experimentation is permitted in order to reduce uncertainty about
$\beta$.

If $\beta$ were known exactly, then (1) would be a well-defined
problem (providing that the desired maximum exists, of course).
But we have assumed that $\beta$ is uncertain, and so (1) is not well-defined.
There are two distinct aspects of the difficulties arising from
uncertainty in $\beta$: the set of feasible decisions is uncertain, and
the objective function is uncertain. Maximization cannot be performed
until the constraints and objective function are reformulated so as
to be independent of $\beta$. We shall discuss a variety of such
reformulations, and it will be seen that quite frequently vector maximum
reformulations play a prominent role.


Vector Maximum Problem

A vector maximum problem arises whenever there is more than one
objective function to be extremized. Consider the problem

(2)    $\underset{x \in X}{\text{``Maximize''}} \; f(x),$

where $f(x) = (f_1(x), \ldots, f_r(x))$ is a vector-valued objective function
(each component of f represents an objective, usually non-additive
with the others, which the decision-maker wants to maximize), and
$X \subseteq E^n$ is a set of feasible decisions. In the fortunate event that
each component of the objective function reaches its maximum
simultaneously, as in Figure 1, then (2) is said to have a perfect solution.
In general, however, an improvement of one objective beyond a certain
point can only be obtained at the expense of worsening another.
Suppose that for a feasible decision $x^\circ$ there exists no other feasible
decision x such that $f(x) \geq f(x^\circ)$.1/ Then $x^\circ$ is termed an
efficient solution2/ of (2). The quotation marks in (2) signify
that it is desired to find all efficient solutions. When they are
all found, the vector maximum problem (2) has been solved.

1/ In this work we adopt the convention that $x \geqq 0$ signifies
$x_i \geq 0$ (i = 1, ..., n), $x \geq 0$ signifies $x_i \geq 0$ (i = 1, ..., n) and
$x_i > 0$ for at least one i, and $x > 0$ signifies $x_i > 0$ (i = 1, ..., n).
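For a finite set of attainable outcomes, the definition of an efficient solution can be checked directly. The following is a minimal sketch under that finiteness assumption; the outcome vectors and the helper names `dominates` and `efficient_indices` are hypothetical and serve only to illustrate the ordering used in the definition.

```python
# Sketch only: efficient (undominated) outcomes of a finite list.

def dominates(u, v):
    """True if u >= v in every component and u > v in at least one."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))

def efficient_indices(outcomes):
    """Indices of outcomes not dominated by any other outcome."""
    return [i for i, u in enumerate(outcomes)
            if not any(dominates(v, u)
                       for j, v in enumerate(outcomes) if j != i)]

# Hypothetical outcome vectors (f_1(x), f_2(x)) of five feasible decisions.
outcomes = [(1, 5), (2, 4), (3, 3), (2, 2), (0, 5)]
efficient = efficient_indices(outcomes)
```

Here the fourth outcome is dominated by the second and third, and the fifth by the first, so only the first three decisions are efficient.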

When f has only two or three components, we envision determining
the entire set of efficient solutions and presenting the corresponding
outcomes in graphical form to the decision-maker, who would then
subjectively determine a trade-off between conflicting objectives
and thus make the final selection of a decision. Figures 1 and 2
illustrate the graph of attainable outcomes for two hypothetical cases
involving two objective functions. The efficient outcomes are denoted
by the heavy line and dot.


Figure 1                    Figure 2


In many applied decision problems, even in the absence of
uncertainty, there are several objective functions which naturally present
themselves to the decision-maker. In such situations, the relevance
of the vector maximum problem is obvious, and need not be emphasized
further. What we do wish to emphasize is that in the presence of
uncertainty even a single-criterion-function problem such as (1),
which we would accept as the "correct" formulation if $\beta$ were known
exactly, tends to explode into vector maximum reformulations when
one attempts to turn it into a well-defined problem.

2/ The notion of an efficient solution is essentially the same as
the notion of "undominated" or "admissible" decisions in decision
theory, and the notion of "Pareto optimality" in game theory (see
Luce and Raiffa, 1957, p. 287 and p. 118).

Plan  of  Discussion 

Because uncertainty in the constraints is fundamentally different
from uncertainty in the objective function of (1), we split our
discussion into two parts: in section 2 we consider ways of reformulating
the constraints so as to be independent of $\beta$, and in section 3 we
consider ways of reformulating the objective function so as to be
independent of $\beta$ (this is usually known as invoking a decision
criterion). These two steps must be accomplished in order to convert
(1) into a well-defined problem. The conversion usually can be
accomplished in several ways, reflecting various compromises which
may be made to uncertainty in $\beta$, realism in the final model, and
computational considerations.

In section 2, three reformulations of the constraints will be
discussed: permanent feasibility, the penalty function reformulation,
and probabilistic constraints. The first two do not require a
probability distribution over B, while the last does. The last two
reformulations sometimes lead to a vector maximum problem.

In section 3 we consider several decision criteria, and some
relations between them are noted. We suggest that a given decision
problem should be attacked by several decision criteria rather than
by only one. The result is, of course, a vector maximum problem. Two
examples are presented which demonstrate the usefulness of considering
two criteria simultaneously. The second example is a one-period
inventory model, and an argument is given for deviating from the
now classical solution.

2.  Treating  Uncertainty  in  the  Feasibility  Constraints 

This section is essentially a review of some of the existing
ways of circumventing uncertainty in the constraints, and is included
mainly for completeness. Mixtures and variations of these basic
approaches can be improvised to cover most particular applications.

The Permanent Feasibility Reformulation

To be absolutely sure of choosing a feasible decision, choice
must be limited to those values of x which are feasible for all
$\beta \in B$. That is, restrict attention to the set3/

    $\bigcap_{\beta \in B} \{x : g(x,\beta) \geqq 0\}$

(see Madansky, 1962 and 1963).

An obvious difficulty with this reformulation is that when B
is "large," the permanently feasible set is apt to be "small," and even
may be empty. When the maximization operation is performed subsequently,
there may be little opportunity to achieve a satisfactorily high value
of the objective function.

3/ We adopt the notation of using braces to denote sets in this work.
The symbol $\emptyset$ denotes the empty set.
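The shrinking of the permanently feasible set as B grows can be seen in a small sketch. It assumes finite candidate decisions and a finite B so that the intersection can be enumerated; the constraint function `g` and the sets `B_small` and `B_large` are hypothetical.

```python
# Sketch only: permanently feasible decisions over a finite uncertainty set.

def g(x, beta):
    """Hypothetical constraint vector: x >= 0 and x <= beta."""
    return (x, beta - x)

def permanently_feasible(candidates, B):
    """Decisions x with g(x, beta) >= 0 componentwise for every beta in B."""
    return [x for x in candidates
            if all(min(g(x, beta)) >= 0 for beta in B)]

candidates = [0, 1, 2, 3, 4, 5]
B_small = [4, 5]        # little uncertainty about beta
B_large = [1, 3, 5]     # more uncertainty about beta

feasible_small = permanently_feasible(candidates, B_small)
feasible_large = permanently_feasible(candidates, B_large)
```

With the larger set B only the decisions 0 and 1 remain, which illustrates how little room may be left for the subsequent maximization.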


The Penalty Function Reformulation

The above reformulation does not admit the possibility of ever
choosing a decision which is infeasible. What does it mean to say that
a decision $x'$ is "infeasible" when, say, $\beta'$ obtains? Mathematically,
we have $g(x',\beta') \not\geqq 0$, which means that either $(x',\beta')$ is physically
impossible, or is physically possible but "undesirable" (we are
distinguishing between those constraints which are dictated by the physical
limitations of the system and those which are imposed at the model-
maker's discretion). In the second case, it may be possible to take
additional action in order to make the outcome less "undesirable,"
or at least to pay a price for being "infeasible." Denote this "price"
by $p(x',\beta')$, not necessarily measured in dollars. Note that p is,
in general, a vector-valued function, reflecting the fact that
violations of different constraints may imply different dimensions of
disutility. For example, consider an investment portfolio optimization
model which has as its objective the maximization of portfolio worth
at the end of a specified horizon. One constraint may specify a desired
level of diversification (e.g., a maximum percentage of the portfolio in
defense industries), and another constraint may specify a lower bound
on the average Standard and Poor's quality rating of the securities.
Violation of each of these constraints would be measured in different
units from the unit of measurement of the objective function.

The penalty function reformulation of (1) results, in general,
in a vector maximum problem of the form

(3)    $\underset{x}{\text{``Maximize''}} \; f(x,\beta), \; -p(x,\beta).$
An important special case arises when p has but one component,
and this component is additive with f. This reformulation then
becomes4/

(3.1)    $\underset{x}{\text{Maximize}} \; [f(x,\beta) - p(x,\beta)].$

All of the two-stage "stochastic programming" problems (see, e.g.,
Dantzig, 1955, Madansky, 1962, and Mangasarian and Rosen, 1964) can
be thought of as penalty function reformulations. The basic idea of
these problems is to append a second stage to the original problem
to "correct for" possible infeasibility of the original decision; p
then represents the minimum cost of correcting for an infeasible x,
as affected by the then known actual value of $\beta$. The usual example
of a situation in which the two-period formulation may be appropriate
is the case of a manufacturer who is committed to produce to satisfy
an unknown demand $\beta$ for his perishable products. If all of the
demand is not satisfied, then he purchases the difference on the open
market.

4/ Note that (1) can be written equivalently in this form if p is
taken to be arbitrarily large for infeasible combinations of x and
$\beta$, and equal to zero for feasible combinations. For example,

    $\underset{x}{\text{Maximize}} \; \Big[ \inf_{u \geqq 0} \big[ f(x,\beta) + \sum_{i=1}^{m} u_i g_i(x,\beta) \big] \Big].$

Probabilistic Constraints

Assume that $\beta$ may be regarded as a random variable, and that
its probability distribution over B is known.

The notion of permanent feasibility may be relaxed if one requires
merely that each or all of the constraints must hold with at least
some prescribed probability. For example, consider

    $\underset{x}{\text{Maximize}} \; f(x,\beta) \quad \text{subject to} \quad \text{Prob}[g_i(x,\beta) \geq 0] \geq \alpha_i, \; i = 1, \ldots, m,$

where $0 \leq \alpha_i \leq 1$ (i = 1, ..., m). Charnes and Cooper (1959, 1963)
refer to this as "chance-constrained" programming. Note that when
each $\alpha_i$ is nearly one, this reformulation approaches the permanent
feasibility reformulation.
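A minimal sketch of the chance-constrained feasible set follows, assuming that $\beta$ takes finitely many values with known probabilities so that each probability can be summed exactly. The distribution `dist`, the single constraint, and the helper names are hypothetical; a small tolerance guards against floating-point rounding in the sums.

```python
# Sketch only: chance-constrained feasibility under a finite distribution.

def prob_satisfied(g_i, x, dist):
    """Prob[g_i(x, beta) >= 0] when beta has the finite distribution dist."""
    return sum(p for beta, p in dist if g_i(x, beta) >= 0)

def chance_feasible(candidates, constraints, alphas, dist, tol=1e-12):
    """Decisions meeting every constraint i with probability >= alphas[i]."""
    return [x for x in candidates
            if all(prob_satisfied(g_i, x, dist) >= a - tol
                   for g_i, a in zip(constraints, alphas))]

dist = [(1, 0.1), (3, 0.3), (5, 0.6)]       # hypothetical (beta, probability)
constraints = [lambda x, beta: beta - x]    # single constraint: x <= beta
candidates = [0, 1, 2, 3, 4, 5]

feasible_90 = chance_feasible(candidates, constraints, [0.9], dist)
feasible_100 = chance_feasible(candidates, constraints, [1.0], dist)
```

Raising the prescribed probability from 0.9 to 1 shrinks the feasible set from four decisions to two, the latter being the permanently feasible set for this example.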

Another probabilistic constraint reformulation is

    $\underset{x}{\text{Maximize}} \; f(x,\beta) \quad \text{subject to} \quad E[g(x,\beta)] \geqq 0,$

where "E" denotes expectation.

As an alternative to the formulations above, one may incorporate
some or all of the probabilistic constraints in the objective function,
e.g.,

    $\underset{x}{\text{``Maximize''}} \; f(x,\beta), \; \text{Prob}[g_1(x,\beta) \geq 0] \quad \text{subject to} \quad \text{Prob}[g_i(x,\beta) \geq 0] \geq \alpha_i, \; i = 2, 3, \ldots, m.$

The efficient solutions to the resulting vector maximum problem show
clearly the available trade-offs between the original objective function
and assurance that various of the constraints will be met.
3. Treating Uncertainty in the Objective Function

In section 2 we discussed several ways of reformulating the
constraints so as to be independent of $\beta$. Here we assume that this
has been accomplished, and discuss several ways of reformulating the
objective functions so as to be independent of $\beta$. For the sake of
simplicity of discussion, we shall treat the case of but a single
objective function, so that the problem to be considered in this section
can be rewritten as

(4)    $\underset{x \in X}{\text{Maximize}} \; f(x,\beta).$

As before, $\beta$ is known to lie in a given set B, and X is the
set of feasible decisions.

Since it is necessary to choose a decision x before $\beta$ is
revealed (if it is ever revealed), $f(x,\beta)$ must be replaced by a
known function of x alone. That is, (4) must be reformulated as

(4.0)    $\underset{x \in X}{\text{Maximize}} \; f(x),$

where f is a known function to be chosen. The choice of f in a
given situation is equivalent to what is customarily known as the choice
of a decision criterion. If a decision is an optimal solution of
(4.0), it is said to satisfy the decision criterion which produces f(x)
from $f(x,\beta)$.

After first discussing two alternative restatements of (4), we
shall briefly summarize the admissibility criterion, the maxmin payoff
criterion, the estimate criterion, and the Principle of Insufficient
Reason. The difficulty of finding a single ideal decision criterion
is well-known, and so we take the position that it may be more useful
to select two criteria, each with distinct merits of its own, and
recast (4) as a vector maximum problem (each component of the
vector-valued objective function is derived from one decision criterion).
An example is presented to illustrate the possible advantages of such
a procedure.

We then shall assume that a probability distribution over B is
given. The concept of stochastic admissibility is introduced as a
generalization of the ordinary concept of admissibility. Next we
examine three decision criteria for reducing (4) to a well-defined
problem with heavy emphasis on a geometric motivation for each in
order to gain insight and understanding. These are the maximum
expected payoff criterion, the maximum $\alpha$-fractile criterion (maximize
the $\alpha$-fractile of the distribution of $f(x,\beta)$ under the probability
distribution of $\beta$, for some preselected $\alpha$), and an aspiration
criterion (maximize the probability of achieving at least some
prescribed level of payoff). Several propositions are proved which
relate these criteria to each other and to the previously mentioned
criteria which do not involve probabilities. Finally, a one-period
inventory example is presented to illustrate the ideas of this section
and to support the suggestion that several criteria, rather than a
single one, should be selected to embody the conflicting aims of the
decision-maker. The resulting vector maximum problem should then be
solved in place of (4).
Alternative Problem Statements

In some situations the objective function of (4) can be written
as $f(x,\beta) = F_1(x) + F_2(x,\beta)$. If $F_1$ and $F_2$ each represent a
quantity which the decision-maker wants to maximize, one may
reformulate (4) as a two-component vector maximum problem

    $\underset{x \in X}{\text{``Maximize''}} \; F_1(x), \; F_2(x,\beta),$

so as to quarantine the part depending on $\beta$. The advantage of this
formulation is that the decision-maker gains a clearer understanding
of how his objectives are influenced by uncertainty. As an example,
let $F_1$ represent the immediate payoff of a multistage decision
problem, and let $F_2$ represent the present worth of the future payoffs,
where $\beta$ represents the future values of exogenous variables.

Another restatement of (4) is obtained by using regret in place
of payoff. Assume that $\max_{x \in X} f(x,\beta)$ is achieved for each $\beta \in B$.
The regret due to making decision x and then observing $\beta$ is defined
to be

    $r(x,\beta) = [\max_{x \in X} f(x,\beta)] - f(x,\beta).$

Stating problems in terms of regret rather than payoff has the advantage
of highlighting the consequences of uncertainty in $\beta$ dramatically.
In addition, regret may have more tractable mathematical properties
than payoff (assuming that the indicated maximization operation is
not overly difficult), due to non-negativity and sometimes symmetry.

When $\beta$ is known exactly, maximizing payoff is, of course,
exactly equivalent to minimizing regret. When $\beta$ is uncertain,
however, and various criteria are applied in order to arrive at a
decision, it is well-known that different decisions often result
depending on whether payoff or regret is used.

In this work the discussion will be carried on primarily in
terms of payoff, but with the obvious modifications each criterion
can be applied to regret as well.
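The difference between working with payoff and with regret can be seen on a small table. The sketch below assumes finitely many decisions and values of $\beta$; the payoff numbers are hypothetical, and a worst-case rule is applied to both tables purely as an illustration of how the two statements can disagree.

```python
# Sketch only: regret table and a worst-case rule applied to payoff and regret.

def regret_table(payoff):
    """r(x, beta) = [max over decisions of f(., beta)] - f(x, beta).

    payoff[i][j] is the payoff of decision i when beta takes its j-th value.
    """
    col_best = [max(col) for col in zip(*payoff)]
    return [[best - v for v, best in zip(row, col_best)] for row in payoff]

payoff = [[4, 4],       # hypothetical decision 0: safe in both states
          [10, 1]]      # hypothetical decision 1: good in state 0 only

regret = regret_table(payoff)

# Worst-case rule on payoff: largest minimum payoff.
by_payoff = max(range(len(payoff)), key=lambda i: min(payoff[i]))
# Worst-case rule on regret: smallest maximum regret.
by_regret = min(range(len(regret)), key=lambda i: max(regret[i]))
```

Every regret entry is non-negative and each column contains a zero, yet the payoff table selects decision 0 while the regret table selects decision 1.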

3.1 Reformulations not Involving Probabilities

We  shall  briefly  review  a  few  classical  decision  criteria  which 
do  not  involve  probabilities.  An  example  is  given  to  illustrate  that 
it  can  be  more  useful  to  consider  several  criteria  simultaneously 
rather  than  to  search  for  a  single  ideal  criterion. 

Admissibility  Criterion 

Consider (4). A decision $x'$ is said to be admissible (with
respect to X and B) if $x' \in X$ and if there exists no other
decision $x'' \in X$ such that $f(x'',\beta) \geq f(x',\beta)$ for all $\beta \in B$, with
strict inequality holding for some value of $\beta \in B$. If such a decision
did exist, it would be said to dominate $x'$ (one may also define
weak dominance by dropping the proviso that strict inequality must
hold for some value of $\beta$). The admissibility criterion requires
that one choose an admissible decision. In other words, if a(x)
is defined to be equal to 0 if x is admissible and equal to -1
if x is inadmissible, (4) is reformulated as:

(4.1)    $\underset{x \in X}{\text{Maximize}} \; a(x).$
The difficulties with this criterion are twofold: the set of
admissible decisions may be onerous to determine computationally, and
this set may be quite a large subset of X.

Maxmin Payoff Criterion

A conservative decision-maker might invoke the maxmin payoff
criterion, which yields

(4.2)    $\underset{x \in X}{\text{Maximize}} \; [\inf_{\beta \in B} f(x,\beta)].$

The corresponding criterion in terms of regret is known, of course,
as the minmax regret criterion.

Estimate Criterion

The estimate criterion requires that one pick a value for $\beta$,
say $\hat{\beta}$, and then act as though $\hat{\beta}$ were the true value of $\beta$.5/
That is, solve

(4.3)    $\underset{x \in X}{\text{Maximize}} \; f(x,\hat{\beta}).$

Since $\hat{\beta}$ may be chosen to be any point in B, we see that we
really have a whole family of criteria.

The computational advantages of this approach are obvious. It
is not so obvious that there exists a "good" estimate in B, or
how to find one.

5/ This criterion is included in order to formalize the common practice
of using judgmental or engineering approximations to costs and other
parameters of decision models. The notion of an estimate is related
to the idea of a certainty equivalent, which will be discussed at the
end of subsection 3.2. It should be noted that this criterion may
also be invoked when $\beta$ is regarded as a random variable, and in
fact, the expected value of $\beta$ is a popular estimate.

The  Principle  of  Insufficient  Reason 

Assume that B consists of a finite number (k) of elements,
each denoted by $\beta_j$. Then the Principle of Insufficient Reason asserts
that one should replace (4) by

(4.4)    $\underset{x \in X}{\text{Maximize}} \; \frac{1}{k} \sum_{j=1}^{k} f(x,\beta_j).$

Comparison  of  Criteria 

The above decision criteria are representative of the methods
which have been proposed in an effort to circumvent uncertainty in
the objective function in the absence of probabilities. The
difficulties of selecting one criterion which satisfies all of a
comprehensive set of intuitively appealing desiderata for "rational"
decision-making are well-known (see, e.g., Luce and Raiffa, 1957,
Chapter 13), and suggest the futility of seeking an ideal criterion.
One possible way out of this dilemma is to consider several criteria
at once, and thus to reformulate (4) as a vector maximum problem.
The actual choice of a decision would be made on an ad hoc basis
from the set of efficient solutions.

Table 1 defines a decision problem in which there are four
possible values of $\beta$, and five possible decisions. The entries
give the values of $f(x_i,\beta_j)$ and the consequences of each possible
decision in terms of average payoff (on which the Principle of
Insufficient Reason is based) and in terms of minimum payoff (on
which the maxmin payoff criterion is based). Figure 3 graphs these
consequences.

All  decisions  are  admissible.  The  Principle  of  Insufficient 
Reason  would  lead  to  the  choice  of  decision  number  two,  while  the 
maxmin  payoff  criterion  leads  to  the  fifth  decision.  However,  it 
seems  reasonable  to  favor  the  fourth  decision  over  any  of  the  others 
because  it  comes  very  close  to  satisfying  both  of  the  above  criteria. 

We submit that by judicious choice of two criteria the resulting
vector maximum reformulation of (4) can be expected to lead to a more
satisfactory decision than a single criterion.
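The two-criterion reformulation can be sketched on a small payoff table. The numbers below are hypothetical and are not the entries of Table 1; they were chosen only so that the same qualitative pattern appears (every decision admissible, the second decision best by average payoff, the fifth best by minimum payoff, and the fourth close to both).

```python
# Sketch only: average payoff and minimum payoff as two objectives.
# The payoff table is hypothetical, not the Table 1 of the text.

payoff = [
    [9, 1, 2, 4],   # decision 1
    [1, 9, 9, 9],   # decision 2
    [8, 2, 5, 1],   # decision 3
    [5, 7, 7, 7],   # decision 4
    [6, 6, 6, 6],   # decision 5
]

average = [sum(row) / len(row) for row in payoff]   # insufficient reason
minimum = [min(row) for row in payoff]              # maxmin payoff

best_by_average = max(range(len(payoff)), key=lambda i: average[i])
best_by_minimum = max(range(len(payoff)), key=lambda i: minimum[i])

def dominates(u, v):
    """True if u >= v in every component and u > v in at least one."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))

# Admissibility is judged on the full payoff rows.
admissible = [i for i, u in enumerate(payoff)
              if not any(dominates(v, u) for v in payoff)]

# Efficiency is judged on the two-criterion outcomes (average, minimum).
outcomes = list(zip(average, minimum))
efficient = [i for i, u in enumerate(outcomes)
             if not any(dominates(v, u) for v in outcomes)]
```

With zero-based indices, the efficient set is decisions 2, 4 and 5 of the table, and the decision-maker would choose among these three by inspecting the trade-off between average and minimum payoff.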

3.2  Reformulations  Involving  Probabilities 

With the additional assumption that $\beta$ may be regarded as a
random variable, one may choose to regard (4) as a continuous game
in normal form. This viewpoint, and the consequent game-theoretic
solutions, will not be considered here. Instead it will be assumed
that $\beta$ has a known probability distribution $\mu$ over B, and so
(4) may be regarded as a game against a neutral "Nature." That is,
we are in what is sometimes known as a situation of individual
decision-making under "risk."

The principal tenet of utility theory (an excellent summary is
given in Luce and Raiffa, 1957, Chapter 2) is that for a "rational"
decision-maker there exists a utility transformation of f, which
we denote by u(f), such that the most preferred decision is an
optimal solution of:

     Maximize   E[u(f(x,β))] .
      x ∈ X

If one accepts any of the sets of axioms of rational behavior leading
to this result, then the maximum expected utility criterion is justified
provided that the required utility transformation is at hand.

Unfortunately it may be very tedious actually to determine u(f).
For this reason (and also because of certain reservations which we
have with regard to the axioms of utility theory), we shall consider
other criteria which can be applied directly to f(x,β) without the
need for a utility transformation. We begin by introducing a natural
analog of the admissibility criterion.

Stochastic  Admissibility  Criterion 

For fixed x, μ induces a probability distribution on f which
may be plotted in cumulative form as in Figure 4 (each curve represents
the cumulative distribution function of f corresponding to a different
value of x). Loosely speaking, one wishes to perform (4) by choosing
an x which determines a c.d.f. that is uniformly as low (or,
equivalently, as far to the right) as possible. In Figure 4 it is clear that
the  c.d.  f.  determined  by  must  be  strictly  preferred  to  that  of 

x^,  while  x^  need  not  be  preferred  to  Xj*  Observe  that  although 
the  probability  density  functions  determined  by  Xj^  and  x^  overlap, 
the  c.d.f. 's  do  not. 

We formalize the above ideas in terms of the concept of stochastic
dominance. A decision x° is said to stochastically dominate x'
if Prob[f(x°,β) < k] ≤ Prob[f(x',β) < k] for all real k, with
strict inequality holding for at least one value of k (if we drop
the proviso that strict inequality must hold for at least one value
of k, then we use the term weak stochastic dominance). If a feasible
decision is not stochastically dominated by any other feasible decision,
it is said to be stochastically admissible.6/ The stochastic
admissibility criterion requires that one choose a stochastically
admissible decision (this criterion can be written in a form similar
to (4.1)).
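For a finite set of discrete payoff distributions, the definition above can be checked mechanically. The following Python sketch is illustrative only: the function names and the three sample distributions are assumptions introduced here, not part of the original text. It uses the strict c.d.f. Prob[payoff < k], which for a discrete distribution only needs to be compared at the payoff values themselves and at one point above the largest of them.

```python
from typing import Dict, List

Dist = Dict[float, float]  # maps payoff value -> probability


def cdf_strict(dist: Dist, k: float) -> float:
    """Return Prob[payoff < k] for a discrete payoff distribution."""
    return sum(p for v, p in dist.items() if v < k)


def check_points(d0: Dist, d1: Dist) -> List[float]:
    """Points at which two left-continuous step c.d.f.'s must be compared."""
    atoms = sorted(set(d0) | set(d1))
    return atoms + [atoms[-1] + 1.0]


def stochastically_dominates(d0: Dist, d1: Dist, tol: float = 1e-12) -> bool:
    """True if d0's c.d.f. is never above d1's and is strictly below it somewhere."""
    ks = check_points(d0, d1)
    never_above = all(cdf_strict(d0, k) <= cdf_strict(d1, k) + tol for k in ks)
    somewhere_below = any(cdf_strict(d0, k) < cdf_strict(d1, k) - tol for k in ks)
    return never_above and somewhere_below


def stochastically_admissible(decisions: Dict[str, Dist]) -> List[str]:
    """Names of decisions not stochastically dominated by any other decision."""
    return [a for a in decisions
            if not any(stochastically_dominates(decisions[b], decisions[a])
                       for b in decisions if b != a)]


# Hypothetical payoff distributions for three decisions.
decisions = {
    "a": {0.0: 0.5, 10.0: 0.5},
    "b": {0.0: 0.5, 5.0: 0.5},
    "c": {-5.0: 0.1, 20.0: 0.9},
}
print(stochastically_admissible(decisions))
```

In this made-up example decision "a" dominates "b", while "a" and "c" have crossing c.d.f.'s, so neither dominates the other.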

Remark: Although we do not choose to do so in this paper, it is
possible to strengthen the stochastic admissibility criterion somewhat
by permitting randomized decisions over X. One would say
that the feasible decision x' is stochastically inadmissible
under a randomized decision strategy if there exists a probability
distribution λ on X not involving x' such that
E_λ{Prob[f(x,β) < k]} ≤ Prob[f(x',β) < k] for all k, with
strict inequality holding for at least one value of k, where
E_λ denotes the average over x taken with respect to λ. For
example, in Figure 4, x_3 is stochastically dominated by
the randomized strategy which chooses x_2 and x_4 each with
a probability of one-half, even though neither x_2 nor x_4
stochastically dominates x_3 alone. Randomized decision
rules have the effect of taking vertical convex combinations
of the c.d.f.'s. It is clear that the set of stochastically
admissible decisions allowing randomized strategies is contained
in the set of stochastically admissible decisions allowing only
pure strategies.

6/ Since stochastic admissibility is defined in terms of X and the
particular distribution μ, to be precise we should qualify stochastic
admissibility as being "with respect to X and μ." We omit this
qualification for the sake of brevity, since no confusion is likely to
result in our discussion.

We  now  explore  the  relationship  between  ordinary  and  stochastic 
admissibility. 

Proposition 1:

Let μ vanish outside of B. If x° weakly dominates x',
then x° weakly stochastically dominates x'.

Proof: We must show that for all real k, Prob[f(x°,β) < k]
≤ Prob[f(x',β) < k]. By the definition of (non-stochastic) weak
dominance we have f(x',β) ≤ f(x°,β) for all β ∈ B. Thus for any
fixed value of k, f(x°,β) < k implies f(x',β) < k, and so for
each k we have

     {β ∈ B: f(x°,β) < k} ⊆ {β ∈ B: f(x',β) < k} .

The proposition follows.

Remark: To see that the converse of this proposition need not hold,
consider the following example. X = {x°,x¹}, B = {β¹,β²},
f(x°,β¹) = f(x¹,β²) = 1, f(x°,β²) = ..., Prob[β = β¹] = .2
and Prob[β = β²] = .8. Then x° stochastically dominates x¹,
but x° does not weakly dominate x¹.
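A counterexample of this kind can be verified numerically. In the Python sketch below, the payoff value 2 assigned to the two entries f(x°,β²) and f(x¹,β¹) is a hypothetical choice made only for illustration (any common value greater than 1 would serve); the state labels and function names are likewise assumptions.

```python
# State probabilities as given in the example.
probs = {"beta1": 0.2, "beta2": 0.8}

# Payoff table; the two entries equal to 2.0 are HYPOTHETICAL values.
payoff = {
    "x0": {"beta1": 1.0, "beta2": 2.0},
    "x1": {"beta1": 2.0, "beta2": 1.0},
}


def weakly_dominates(a: str, b: str) -> bool:
    """State-by-state comparison: a is at least as good as b in every state."""
    return all(payoff[a][s] >= payoff[b][s] for s in probs)


def cdf_strict(x: str, k: float) -> float:
    """Prob[f(x, beta) < k]."""
    return sum(p for s, p in probs.items() if payoff[x][s] < k)


def stochastically_dominates(a: str, b: str, tol: float = 1e-12) -> bool:
    """Compare the two c.d.f.'s at every payoff value and once above the largest."""
    values = sorted({v for row in payoff.values() for v in row.values()})
    ks = values + [values[-1] + 1.0]
    never_above = all(cdf_strict(a, k) <= cdf_strict(b, k) + tol for k in ks)
    somewhere_below = any(cdf_strict(a, k) < cdf_strict(b, k) - tol for k in ks)
    return never_above and somewhere_below


print(stochastically_dominates("x0", "x1"), weakly_dominates("x0", "x1"))
```

With these assumed values x° receives the larger payoff in the likelier state, which is enough for stochastic dominance even though x¹ is better in state β¹.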

With additional hypotheses, one may strengthen Proposition 1.

Proposition 2:

Let f(x,β) be continuous on B for each x ∈ X, and let
μ be positive7/ everywhere on and vanish outside of B. If
x° dominates x', then x° stochastically dominates x'.

Proof: From Proposition 1 we have that x° weakly stochastically
dominates x'. It remains to show that Prob[f(x°,β) < k*] <
Prob[f(x',β) < k*] for some k*. Since x° dominates x', there
exists β* ∈ B such that f(x°,β*) > f(x',β*). Put k* =
1/2(f(x°,β*) + f(x',β*)). By the continuity of f there is a
neighborhood N* of β* such that f(x°,β) > k* > f(x',β) for all
β ∈ N* ∩ B, and so by the positivity of μ on B we have
Prob[f(x°,β) > k* > f(x',β)] > 0. This fact, together with
f(x',β) ≤ f(x°,β) for all β ∈ B, yields

     Prob[f(x',β) < k*] = Prob[f(x',β) < k* ≤ f(x°,β)] +
                          Prob[f(x',β) < k* > f(x°,β)]

                        = Prob[f(x',β) < k* ≤ f(x°,β)] + Prob[f(x°,β) < k*]

                        > Prob[f(x°,β) < k*] .

7/ A probability distribution is said to be positive everywhere on
B if for each β° ∈ B and for every (b-dimensional) neighborhood
N of β° the event [β ∈ N ∩ B] has a non-zero probability. A
neighborhood of β° of radius ρ is defined as
{β: Σ_i (β_i - β_i°)² < ρ²}, and is denoted by N_ρ(β°) when a
complete notation is desired.

Proposition 2 shows that, under the given assumptions, the set
of stochastically admissible decisions is contained in the set of
admissible decisions, as one would expect and hope. To see that the
set of stochastically admissible decisions can be considerably smaller
than the set of admissible decisions, consider the example

     Maximize   [10 - (β - x)²] ,
      x ∈ R¹

where μ is the Normal distribution with mean β̄ and variance σ²,
and B = R¹. Viewing the objective function as a family of functions
of β indexed by x, this family is seen to consist of concave
parabolas which are identical except for the axis of symmetry, which
occurs at β = x. Clearly every x° ∈ R¹ is admissible, for
f(x°,β = x°) = 10 > f(x,β = x°) for all x ≠ x°. It is also clear
that x' ≠ β̄ is stochastically inadmissible, for Prob[f(β̄,β) < k] ≤
Prob[f(x',β) < k] for all k. To see this assertion, observe that
{β: f(x,β) ≥ k} is an interval of width 2(10-k)^(1/2) centered at
β = x. By the symmetry and unimodality of the Normal distribution,
the interval centered at β = β̄ must include the greatest probability
for any k, and hence Prob[f(β̄,β) ≥ k] ≥ Prob[f(x',β) ≥ k] when
x' ≠ β̄, which is equivalent to the assertion that x' ≠ β̄ is
stochastically inadmissible. Since x = β̄ is stochastically admissible,
we see that only x = β̄ is stochastically admissible, whereas all x
are admissible.
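The interval argument can be illustrated numerically. The sketch below is a hedged illustration, assuming a Normal distribution with mean 0 and standard deviation 1 (both values are arbitrary assumptions) and using the standard Normal c.d.f. written in terms of `math.erf`. It evaluates Prob[10 - (β - x)² ≥ k] for the decision at the mean and for a few off-center decisions.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_payoff_at_least(x: float, k: float,
                         mean: float = 0.0, sigma: float = 1.0) -> float:
    """Prob[10 - (beta - x)^2 >= k] when beta ~ Normal(mean, sigma^2).

    The event is the interval |beta - x| <= sqrt(10 - k), empty when k > 10.
    """
    if k > 10.0:
        return 0.0
    half_width = math.sqrt(10.0 - k)
    upper = normal_cdf((x + half_width - mean) / sigma)
    lower = normal_cdf((x - half_width - mean) / sigma)
    return upper - lower


# Compare the decision at the mean with some off-center decisions (assumed values).
for x_prime in (-2.0, 0.5, 3.0):
    for k in (0.0, 5.0, 9.0, 9.9):
        centered = prob_payoff_at_least(0.0, k)
        shifted = prob_payoff_at_least(x_prime, k)
        print(f"x'={x_prime:5.1f} k={k:4.1f} centered={centered:.4f} shifted={shifted:.4f}")
```

For every aspiration level tried, the centered interval carries at least as much probability as the shifted one, which is the content of the dominance claim.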

The Maximum α-Fractile and the Aspiration Criteria

In terms of Figure 4, we would like to choose a decision which
achieves the lower envelope of c.d.f.'s everywhere. In general this
is impossible, but we can attempt to achieve it at a single point and
hope that this one point will "pin down" a c.d.f. so that it is close
to the lower envelope. The point may be specified in terms of its
ordinate or abscissa value, whichever seems most natural in a given
problem context. The criteria implied by this idea are, respectively
and loosely:

     Criterion F: Choose an x which corresponds to a
     c.d.f. which approaches the lower envelope of c.d.f.'s
     at an ordinate value of α (0 ≤ α < 1).

     Criterion A: Choose an x which corresponds to a
     c.d.f. which approaches the lower envelope at an abscissa
     value of M (-∞ < M < ∞).

It is evident that we have two entire families of criteria here, indexed
by α and M respectively. Criterion F with α = 0.1 would lead
to the choice of x_2 in Figure 4, and Criterion A with M = 20 would
lead to the choice of x_1.


Criterion F is equivalent to maximizing the α-fractile8/ of the
distribution of f(x,β) under μ. That is, it maximizes the payoff
level below which there is at most an α probability of falling.9/
It corresponds, for fixed 0 ≤ α < 1, to:

                Maximize   k
                  k,x

     (4.5)      subject to x ∈ X

                           Prob[f(x,β) < k] ≤ α .

8/ We define the α-fractile of a (possibly mixed) cumulative
distribution function F(y) = Prob[Y < y] as

     Sup{k: F(k) ≤ α} .

9/ See Kataoka (1965) for a linear programming model of this type.
It is one of the few published references to this criterion.

When α is small, say less than 0.1, this criterion should appeal
to conservative decision-makers because it tends to control the lower
tail of the distribution of payoffs. When α = 1/2, (4.5) maximizes
the median of the distribution of payoffs, of course. We sometimes
use the mnemonic notation F(α) for this criterion.

Criterion A is equivalent to maximizing the probability of exceeding
a prescribed "aspiration" level M of payoff (see Charnes and Cooper,
1965, for an application to linear programming). It corresponds to:

     (4.6)      Minimize   Prob[f(x,β) < M] .
                 x ∈ X

We sometimes use the notation A(M) for this criterion.
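For discrete payoff distributions both criteria reduce to simple computations. The Python sketch below is a minimal illustration under stated assumptions: the decision names and their payoff distributions are hypothetical, and the α-fractile is computed as Sup{k: Prob[payoff < k] ≤ α}, which for a discrete distribution is the smallest payoff value whose cumulative probability (including itself) exceeds α.

```python
import math
from typing import Dict

Dist = Dict[float, float]  # payoff value -> probability


def cdf_strict(dist: Dist, k: float) -> float:
    """Prob[payoff < k]."""
    return sum(p for v, p in dist.items() if v < k)


def alpha_fractile(dist: Dist, alpha: float, tol: float = 1e-12) -> float:
    """Sup{k: Prob[payoff < k] <= alpha} for a discrete distribution."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative > alpha + tol:
            return value
    return math.inf  # only reached if alpha >= 1


def criterion_F(decisions: Dict[str, Dist], alpha: float) -> str:
    """Decision with the largest alpha-fractile (first one wins a tie)."""
    return max(decisions, key=lambda name: alpha_fractile(decisions[name], alpha))


def criterion_A(decisions: Dict[str, Dist], M: float) -> str:
    """Decision minimizing the probability of falling below M (first one wins a tie)."""
    return min(decisions, key=lambda name: cdf_strict(decisions[name], M))


# Hypothetical decisions.
decisions = {
    "safe": {10.0: 1.0},
    "risky": {0.0: 0.2, 30.0: 0.8},
    "middle": {5.0: 0.1, 20.0: 0.9},
}
print(criterion_F(decisions, 0.1), criterion_F(decisions, 0.5))
print(criterion_A(decisions, 10.0), criterion_A(decisions, 25.0))
```

With α = 0 the fractile of each decision is its smallest possible payoff, so in this example criterion F(0) picks the same decision as a maxmin rule would.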

Remark: It should be noted that all cumulative distribution functions
in this subsection are written as Prob[f(x,β) < k] rather
than as Prob[f(x,β) ≤ k] (regard x as being fixed).
This convention is followed in order to avoid some minor
difficulties which would be encountered by these two criteria
if the opposite convention were adopted and the c.d.f.'s
were discontinuous.

We introduced these two criteria together because of their intimate
mathematical relationship, as well as their common graphical motivation.
When the lower envelope is attained by some x at every point, and is
continuous and strictly increasing, it is geometrically clear that
the F and A criteria are complementary in the sense that for every α
there is an M which leads to the same set of decisions, and conversely.
Without such assumptions, however, the complementarity is weakened,
as we shall see in the following two easy propositions.

Proposition 3:

(i)  Assume that criterion F(α°) is satisfied by at least one
     decision. Then the set of decisions which satisfy criterion
     F(α°) contains the set of decisions which satisfy criterion
     A(M°), where M° is the maximum α°-fractile.

(ii) Assume that criterion A(M°) is satisfied by at least one
     decision. Then the set of decisions which satisfy criterion
     A(M°) contains the set of decisions which satisfy criterion
     F(α°), where α° = Min Prob[f(x,β) < M°], the minimum being
     taken over x ∈ X.

Proof: (i). Let x* satisfy F(α°), and let M° be the maximum
α°-fractile. If x° satisfies A(M°), then Prob[f(x°,β) < M°] ≤
Prob[f(x*,β) < M°] ≤ α°, and so x° must also satisfy F(α°).

(ii). Let x* satisfy A(M°), and let α° = Min Prob[f(x,β) < M°]
= Prob[f(x*,β) < M°], the minimum being taken over x ∈ X. If x°
satisfies F(α°), then there exists k° ≥ M° such that
Prob[f(x°,β) < k°] ≤ α°; since k° ≥ M°, we have
Prob[f(x°,β) < M°] ≤ Prob[f(x°,β) < k°] ≤ α°, from which it
follows that x° must satisfy A(M°).

Proposition 4:

(i)  If x° satisfies criterion F(α°) uniquely, then it
     satisfies criterion A(M°) uniquely, where M° is the
     maximum α°-fractile.

(ii) If x° satisfies criterion A(M°) uniquely, then it
     satisfies criterion F(α°) uniquely, where
     α° = Prob[f(x°,β) < M°].

Proof: (i). Suppose that x° does not satisfy A(M°) uniquely.
Then there exists x' ∈ X, x' ≠ x°, such that Prob[f(x',β) < M°] ≤
Prob[f(x°,β) < M°], which contradicts the fact that x° satisfies
F(α°) uniquely.

(ii). Suppose that x° does not satisfy F(α°) uniquely.
Then there exist k° ≥ M° and x' ∈ X, x' ≠ x°, such that
Prob[f(x',β) < k°] ≤ α° = Prob[f(x°,β) < M°]. Since k° ≥ M°, we
have Prob[f(x',β) < M°] ≤ Prob[f(x',β) < k°], and so
Prob[f(x',β) < M°] ≤ Prob[f(x°,β) < M°]. This contradicts the fact
that x° satisfies A(M°) uniquely.

It is possible for criteria F(α) and A(M) to lead to
stochastically inadmissible decisions. The next proposition is of
interest in this regard.

Proposition 5:

(i)  If x° satisfies criterion F(α°) uniquely, then x°
     also satisfies the stochastic admissibility criterion.

(ii) If x° satisfies criterion A(M°) uniquely, then x°
     also satisfies the stochastic admissibility criterion.

Proof: (i). In view of part (i) of Proposition 4, to prove (i)
it is sufficient to prove (ii).

(ii). Let x° satisfy A(M°) uniquely, so that
Prob[f(x°,β) < M°] < Prob[f(x,β) < M°] for all x ∈ X, x ≠ x°.
Suppose that x° were stochastically inadmissible. Then there would
exist x' ∈ X, x' ≠ x°, such that Prob[f(x',β) < k] ≤
Prob[f(x°,β) < k] for all k. Letting k = M°, one would obtain
a contradiction.

Now we turn to the relationship between the maxmin payoff criterion
and the maximum α-fractile criterion with α = 0. It is not at all
surprising that under mild assumptions these criteria are in fact
equivalent, i.e., the same decisions satisfy both.

Proposition 6:

Assume that f(x,β) is upper semicontinuous10/ on B for each
x ∈ X, and that μ is positive on and vanishes outside of B.
Then the maxmin payoff criterion is equivalent to the maximum
0-fractile criterion.

Proof: We shall rewrite (4.2) and (4.5) in such a way as to
emphasize their similarity, and then show that they are in fact
identical.

The maxmin payoff criterion can be written11/

     Maximize   [Sup{k: f(x,β) ≥ k, ∀ β ∈ B}] ,
      x ∈ X

and the maximum 0-fractile criterion can be written

     Maximize   [Sup{k: Prob[f(x,β) ≥ k] = 1}] .
      x ∈ X

Define S_1(x) and S_2(x) to be the sets appearing in the first and
second problems, respectively, for fixed x. Clearly S_1(x) ⊆ S_2(x),
∀ x ∈ X, for μ vanishes outside of B. The proof will be complete
when we show that S_2(x) ⊆ S_1(x), ∀ x ∈ X.

We consider a fixed x, and drop the x arguments from S_1
and S_2. We may assume that S_2 is not empty, for if it is
empty then S_1 is also empty, and the proof is complete. Take
k' ∈ S_2. Suppose that k' ∉ S_1. Then there exists β' ∈ B such
that f(x,β') < k'. But by the upper semicontinuity of f(x,β) there
exists a neighborhood N' of β' such that f(x,β) < k' for all
β ∈ N' ∩ B. By the positivity of μ on B, this contradicts the
fact that k' ∈ S_2.

10/ Let x be fixed in X. Then f(x,β) is upper semicontinuous
at β° ∈ B if for each ε > 0 ∃ δ > 0 (depending on β° and ε) such
that f(x,β) < f(x,β°) + ε whenever β ∈ N_δ(β°). If f is continuous,
then f is upper semicontinuous. Also, recall that if B is a finite
point set in R^b then f(x,β) is automatically continuous on B.

11/ This problem follows from the definition of 'inf' as the greatest
lower bound.

The F and A criteria have the interesting property that one may
perform a continuous monotonic transformation on f(x,β) without
altering the decisions which satisfy these criteria. This certainly
is not true of the next criterion we shall discuss, the expected
value criterion. We emphasize this point in

Proposition 7:

Let g(t) be any strictly increasing and continuous function
defined from R¹ into R¹. Then (i) the set of decisions which
satisfy criterion F(α) does not alter if f(x,β) is replaced by
g(f(x,β)), and (ii) the set of decisions which satisfy criterion
A(M) does not alter if f(x,β) is replaced by g(f(x,β)) and
M is replaced by g(M).

Proof: Observe that f(x,β) < k if and only if g(f(x,β)) < g(k),
since g is invertible and strictly increasing. Hence {β: f(x,β) < k} =
{β: g(f(x,β)) < g(k)}, and so Prob[f(x,β) < k] = Prob[g(f(x,β)) <
g(k)]. This yields (ii). To see (i), write

     Sup{k: Prob[f(x,β) < k] ≤ α}

       = Sup{k: Prob[g(f(x,β)) < g(k)] ≤ α}

       = Sup{g⁻¹(g(k)): Prob[g(f(x,β)) < g(k)] ≤ α}

       = g⁻¹(Sup{g(k): Prob[g(f(x,β)) < g(k)] ≤ α})

       = g⁻¹(Sup{t: Prob[g(f(x,β)) < t] ≤ α}) .

Finally,

     Max_{x ∈ X} [Sup{k: Prob[f(x,β) < k] ≤ α}]

       = g⁻¹(Max_{x ∈ X} [Sup{k: Prob[g(f(x,β)) < k] ≤ α}]) .
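The invariance property, and its failure for expected values, can be seen on a small example. The sketch below is illustrative only: the two decisions, their payoff distributions, and the transformation g(t) = -exp(-t/5) are all assumptions chosen for the demonstration (g is strictly increasing and continuous on the whole real line).

```python
import math
from typing import Callable, Dict

Dist = Dict[float, float]  # payoff value -> probability


def alpha_fractile(dist: Dist, alpha: float, tol: float = 1e-12) -> float:
    """Sup{k: Prob[payoff < k] <= alpha} for a discrete distribution."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative > alpha + tol:
            return value
    return math.inf


def expected_value(dist: Dist) -> float:
    return sum(v * p for v, p in dist.items())


def transform(dist: Dist, g: Callable[[float], float]) -> Dist:
    """Distribution of g(payoff); g strictly increasing keeps values distinct."""
    return {g(v): p for v, p in dist.items()}


def g(t: float) -> float:
    """Assumed strictly increasing, continuous transformation."""
    return -math.exp(-t / 5.0)


def best(decisions: Dict[str, Dist], score: Callable[[Dist], float]) -> str:
    return max(decisions, key=lambda name: score(decisions[name]))


# Hypothetical decisions.
decisions = {"safe": {10.0: 1.0}, "risky": {0.0: 0.5, 22.0: 0.5}}
transformed = {name: transform(d, g) for name, d in decisions.items()}

for alpha in (0.25, 0.5):
    before = best(decisions, lambda d: alpha_fractile(d, alpha))
    after = best(transformed, lambda d: alpha_fractile(d, alpha))
    print("F", alpha, before, after)
print("E", best(decisions, expected_value), best(transformed, expected_value))
```

In this example the fractile criterion selects the same decision before and after the transformation, while the expected value criterion switches from the risky to the safe decision.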

Maximum Expected Payoff Criterion

The F and A criteria are designed to achieve the lower envelope
of the family of c.d.f.'s {Prob[f(x,β) < k]}, x ∈ X, at a single point,
in an attempt to "pin down" a c.d.f. to lie "close" to the lower
envelope. Another approach would be to use the area above the lower
envelope and below a candidate c.d.f. as a measure of "closeness."

     Criterion E: Choose an x ∈ X which determines the
     c.d.f. with the least area below it and above the lower
     envelope.

We shall show now that this geometrically motivated criterion
is equivalent to the maximum expected payoff criterion:

     (4.7)      Maximize   E[f(x,β)] .
                 x ∈ X

Proposition  8: 

Criterion  E  is  equivalent  to  the  maximum  expected  payoff  criterion. 

Proof: The proof is a simple consequence of the geometric
interpretation of the mean of a random variable in terms of the graph of
its cumulative distribution function. In Figure 5, the mean of the
random variable Y is area 1 minus area 2 (see Parzen, 1960, p. 211).
Denote by A(x)⁺ the area corresponding to area 1 of Figure 5
for the c.d.f. Prob[f(x,β) < k], and by A(x)⁻ the area
corresponding to area 2. Similarly, denote by A⁺ and A⁻ the areas above
and below the lower envelope of all such c.d.f.'s. Then the maximum
expected payoff criterion may be written

     Maximize   [A(x)⁺ - A(x)⁻] ,
      x ∈ X

and Criterion E may be written

     Minimize   [(A(x)⁻ - A⁻) + (A⁺ - A(x)⁺)] .
      x ∈ X

Clearly these two problems lead to the same decisions.
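The area interpretation of the mean used in this proof can be checked on a discrete distribution. The sketch below assumes a particular reading of the figure, which is not reproduced here: area 1 is taken to be the region to the right of zero lying above the c.d.f. and below height one, and area 2 the region to the left of zero lying below the c.d.f. The sample distribution is hypothetical.

```python
from typing import Dict

Dist = Dict[float, float]  # value -> probability


def cdf_strict(dist: Dist, k: float) -> float:
    """Prob[Y < k]."""
    return sum(p for v, p in dist.items() if v < k)


def mean_from_cdf_areas(dist: Dist) -> float:
    """Mean computed as (area 1) - (area 2) from the step c.d.f."""
    breakpoints = sorted(set(dist) | {0.0})
    area1 = 0.0  # right of zero, between the c.d.f. and the line at height 1
    area2 = 0.0  # left of zero, below the c.d.f.
    for left, right in zip(breakpoints, breakpoints[1:]):
        height = cdf_strict(dist, right)  # the c.d.f. is constant on (left, right]
        if right <= 0.0:
            area2 += height * (right - left)
        else:
            area1 += (1.0 - height) * (right - left)
    return area1 - area2


# Hypothetical distribution of Y with mean 2.5.
dist = {-4.0: 0.25, 2.0: 0.25, 6.0: 0.5}
direct_mean = sum(v * p for v, p in dist.items())
print(mean_from_cdf_areas(dist), direct_mean)
```

Because zero is always included among the breakpoints, every interval lies entirely on one side of zero, so each strip contributes to exactly one of the two areas.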


Figure  5 


There is an obvious and fortunate relationship between the maximum
expected payoff criterion and the estimate criterion which sometimes
permits one to choose an estimate in a simple way so that the estimate
criterion is satisfied by the same set of decisions as the expected
payoff criterion.

Proposition 9:

Assume that f(x,β) can be written as

$$ f(x,\beta) = F_1(x) + F_2(\beta) + \sum_i H_i(x)\,\beta_i . $$

Then the estimate criterion with β̂ = E[β] is satisfied by the
same set of decisions as the maximum expected payoff criterion.

Proof: The maximum expected payoff criterion gives

$$ \underset{x \in X}{\text{Maximize}}\ \ E\Big[F_1(x) + F_2(\beta) + \sum_i H_i(x)\,\beta_i\Big] , $$

or

$$ \underset{x \in X}{\text{Maximize}}\ \ \Big[F_1(x) + E[F_2(\beta)] + \sum_i H_i(x)\,E[\beta_i]\Big] . $$

The estimate criterion with β̂ = E[β] gives

$$ \underset{x \in X}{\text{Maximize}}\ \ \Big[F_1(x) + F_2(E[\beta]) + \sum_i H_i(x)\,E[\beta_i]\Big] . $$

Since the F_2 terms of each problem do not contain x, they may
be deleted, and hence the two criteria lead to identical sets of
decisions.

When the above proposition applies, we say that the estimate
β̂ = E[β] is a certainty equivalent with respect to the maximum
expected payoff criterion. Other results in the same vein are given
by Reiter (1957), Simon (1956), and Theil (1964).
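A small numerical check of this certainty-equivalence result follows. Everything specific in the sketch is an assumption made for illustration: the two demand-like scenarios, the decision grid, and the particular functions F1, F2 and H (chosen so that the payoff has the separable form required, with F2 deliberately nonlinear).

```python
# Hypothetical scenarios: (beta vector, probability).
scenarios = [((1.0, 0.0), 0.5), ((3.0, 1.0), 0.5)]
# Hypothetical finite decision set 0.0, 0.5, ..., 3.0.
X = [i * 0.5 for i in range(7)]


def F1(x):
    return -x * x


def F2(beta):
    return beta[0] ** 2  # nonlinear in beta, but free of x


def H(x):
    return (x, 2.0 * x)  # coefficients multiplying beta_1 and beta_2


def f(x, beta):
    """Payoff of the form F1(x) + F2(beta) + sum_i H_i(x) * beta_i."""
    return F1(x) + F2(beta) + sum(h * b for h, b in zip(H(x), beta))


def expected_payoff(x):
    return sum(p * f(x, beta) for beta, p in scenarios)


# Estimate of beta equal to its expectation.
mean_beta = tuple(sum(p * beta[i] for beta, p in scenarios) for i in range(2))

best_expected = max(X, key=expected_payoff)
best_estimate = max(X, key=lambda x: f(x, mean_beta))
print(best_expected, best_estimate)
```

The two objective functions differ only by the constant E[F2(β)] - F2(E[β]), which does not depend on x, so the maximizing decisions coincide.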

It is easy to see from Proposition 8 that any decision which
satisfies the maximum expected payoff criterion must be stochastically
admissible.

It is also worth noting that the expected value criterion leads
to the same decisions when applied to payoff as when applied to regret.
In general this is not true for criteria A(M) and F(α).

3.3  An Example

We present a simple inventory model as an illustration of the
ideas of this section and as a vehicle for further discussion. Consider
a firm stocking and selling a single commodity for a single period of
time. We use the notation

     x      = number of units to be ordered in advance of the
              demand
     β      = unknown demand level during the period
     c      = cost per unit
     r      = revenue per unit (r > c)
     v      = salvage value per unit left at end of period (v < c)
     f(x,β) = total profit
     X      = [0,∞)
     B      = [0,β_max], where β_max is chosen sufficiently large
              to account for the largest likely demand

The payoff and regret are given by

$$
f(x,\beta) =
\begin{cases}
(r-c)\beta - (x-\beta)(c-v) & \text{if } \beta < x \\
(r-c)x & \text{if } \beta \ge x
\end{cases}
$$

$$
r(x,\beta) =
\begin{cases}
(c-v)(x-\beta) & \text{if } \beta < x \\
(r-c)(\beta-x) & \text{if } \beta \ge x .
\end{cases}
$$

First we examine the criteria not involving probabilities over
the set of possible demand levels. All choices for x ∈ X are readily
seen to be admissible. The maxmin payoff criterion leads to the decision
to order zero units, since Min f(x,β) = -(c-v)x, the minimum being
taken over β ∈ B. When this criterion is applied to regret, however,
it (minmax regret) leads to the decision to order
x = (r-c)β_max/(r-v), where β_max denotes the upper endpoint of B.
This is the same decision that the Principle of Insufficient Reason
would give if we interpret it as putting a uniform distribution over B.
The estimate criterion leads to a trivial maximization problem once an
estimate β̂ is chosen, and indicates that we should order exactly x = β̂.
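The two probability-free criteria can be evaluated by brute force on a grid. The sketch below is a hedged illustration: the prices (chosen so that r - c = 3/2 and c - v = 1/2), the upper bound on demand called `beta_max`, and the restriction to integer quantities are all assumptions introduced here.

```python
r, c, v = 2.5, 1.0, 0.5      # hypothetical prices: r - c = 3/2, c - v = 1/2
beta_max = 40                # hypothetical upper bound on demand
grid = range(beta_max + 1)   # integer order quantities and demand levels


def payoff(x, beta):
    """Profit when x units are ordered and demand turns out to be beta."""
    if beta < x:
        return (r - c) * beta - (x - beta) * (c - v)
    return (r - c) * x


def regret(x, beta):
    """Shortfall of payoff(x, beta) from the best payoff attainable for this beta."""
    if beta < x:
        return (c - v) * (x - beta)
    return (r - c) * (beta - x)


# Maxmin payoff: maximize the worst-case payoff over demand levels.
maxmin_x = max(grid, key=lambda x: min(payoff(x, b) for b in grid))
# Minmax regret: minimize the worst-case regret over demand levels.
minmax_regret_x = min(grid, key=lambda x: max(regret(x, b) for b in grid))

print(maxmin_x, minmax_regret_x)
```

Under these assumed numbers the maxmin rule orders nothing, while the minmax regret rule orders 30 units, which equals (r - c) * beta_max / (r - v) and balances the worst overstock regret against the worst understock regret.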

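Next we examine the criteria involving probabilities over the set
of possible demand levels. In order to plot the cumulative
distributions of payoff for various candidate x's, we need to know the
set of β's for which the payoff is less than k.

$$
\{\beta : \beta \ge 0,\ f(x,\beta) < k\} =
\begin{cases}
\emptyset & \text{if } k \le -(c-v)x \\
\left[0,\ \dfrac{k+(c-v)x}{r-v}\right) & \text{if } -(c-v)x < k \le (r-c)x \\
[0,\infty) & \text{if } k > (r-c)x .
\end{cases}
$$

Using the fact that x is non-negative, we have for k ≥ 0

$$
\operatorname{Prob}[f(x,\beta) < k] =
\begin{cases}
1 & \text{if } x < \dfrac{k}{r-c} \\
1 - \displaystyle\int_{\frac{k+(c-v)x}{r-v}}^{\infty} d\mu & \text{if } x \ge \dfrac{k}{r-c} .
\end{cases}
$$

For k < 0,

$$
\operatorname{Prob}[f(x,\beta) < k] =
\begin{cases}
0 & \text{if } x \le \dfrac{-k}{c-v} \\
\displaystyle\int_{0}^{\frac{k+(c-v)x}{r-v}} d\mu & \text{if } x > \dfrac{-k}{c-v} .
\end{cases}
$$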


The lower envelope may be obtained by solving, for all real k,
the problem

     Minimize   Prob[f(x,β) < k] .
      x ≥ 0

This problem has a very simple solution for this example. For k < 0,
the minimum is zero and is achieved for 0 ≤ x ≤ |k|/(c-v). For k ≥ 0,
the minimum is $1 - \int_{k/(r-c)}^{\infty} d\mu$ and is achieved for
x = k/(r-c).

Assume for computational simplicity that the demand is exponentially
distributed with mean 10, that (c-v) = 1/2, and that
(r-c) = 3/2. Then for k ≥ 0, the lower envelope has height
[1 - exp(-.0667 k)], and is achieved at x = 2k/3.12/ Figure 6
illustrates the lower envelope and a few sample c.d.f.'s. Observe
that each c.d.f. jumps to the value 1 as soon as it attains the lower
envelope, and that every x ≥ 0 is stochastically admissible.

We are now in a position to read off the "optimal" decisions
corresponding to criteria A(M) and F(α) for any choice of M
or α. A(M°) leads to the unique choice of x = M°/(r-c), and
F(α°) leads to the unique choice x = -10 ln(1-α°). In this
particular example, these criteria do not fulfill their promise of
"pinning down" a c.d.f. to lie close to the lower envelope, because
each c.d.f. is discontinuous at the point at which it achieves the
lower envelope.

12/ Note that the lower envelope is the c.d.f. of an exponential
distribution with mean 15.
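The payoff c.d.f.'s and their lower envelope can be reproduced numerically. The sketch below assumes exponentially distributed demand with mean 10 and margins r - c = 3/2 and c - v = 1/2; these margins are the values that reproduce the mean-15 envelope noted in the footnote, and are stated here as an assumption. The search grid for x is also an arbitrary choice.

```python
import math

MEAN_DEMAND = 10.0
R_MINUS_C = 1.5   # assumed margin per unit sold
C_MINUS_V = 0.5   # assumed loss per unit left over
R_MINUS_V = R_MINUS_C + C_MINUS_V


def demand_cdf(t: float) -> float:
    """Prob[beta < t] for exponential demand."""
    return 1.0 - math.exp(-t / MEAN_DEMAND) if t > 0.0 else 0.0


def payoff_cdf(x: float, k: float) -> float:
    """Prob[f(x, beta) < k] for order quantity x."""
    if k <= -C_MINUS_V * x:
        return 0.0          # payoff can never fall this low
    if k > R_MINUS_C * x:
        return 1.0          # payoff can never reach k
    return demand_cdf((k + C_MINUS_V * x) / R_MINUS_V)


def lower_envelope(k: float, xs):
    """Order quantity on the grid minimizing Prob[f < k], and the minimum itself."""
    best_x = min(xs, key=lambda x: payoff_cdf(x, k))
    return best_x, payoff_cdf(best_x, k)


xs = [i * 0.5 for i in range(121)]   # candidate order quantities 0, 0.5, ..., 60
best_x, height = lower_envelope(15.0, xs)
print(best_x, height, 1.0 - math.exp(-15.0 / 15.0))
```

For k = 15 the grid search returns x = 10, in agreement with x = 2k/3, and the envelope height matches the c.d.f. of an exponential distribution with mean 15.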

The maximum expected payoff criterion may be applied by setting
the derivative of E[f(x,β)] equal to zero and solving for x.
This computation leads to the well-known (Dvoretzky, Kiefer, and
Wolfowitz, 1952) result that one should choose the value of x
corresponding to the (r-c)/(r-v)-th fractile of μ. That is,
x* should satisfy $\int_0^{x^*} d\mu = (r-c)/(r-v)$. For the data
assumed above, x* = 13.8. It is interesting to observe that if μ were
uniform over B then the minmax regret criterion would lead to exactly
the same action as would the maximum expected payoff criterion.
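The fractile rule can be confirmed against a direct search. The sketch below assumes exponential demand with mean 10 and margins r - c = 3/2, c - v = 1/2 (the margins consistent with x* = 13.8); the closed-form expectations used are standard results for the exponential distribution, and the grid resolution is an arbitrary choice.

```python
import math

MEAN_DEMAND = 10.0
R_MINUS_C = 1.5   # assumed margin per unit sold
C_MINUS_V = 0.5   # assumed loss per unit left over


def expected_payoff(x: float) -> float:
    """E[f(x, beta)] for exponential demand, using closed-form expectations."""
    shortage = MEAN_DEMAND * math.exp(-x / MEAN_DEMAND)   # E[max(beta - x, 0)]
    sales = MEAN_DEMAND - shortage                         # E[min(beta, x)]
    leftover = x - sales                                   # E[max(x - beta, 0)]
    return R_MINUS_C * sales - C_MINUS_V * leftover


# Fractile rule: the demand c.d.f. at x* equals (r - c) / (r - v).
critical_fractile = R_MINUS_C / (R_MINUS_C + C_MINUS_V)
x_star = -MEAN_DEMAND * math.log(1.0 - critical_fractile)

# Brute-force check on a fine grid of order quantities.
grid = [i * 0.01 for i in range(4001)]
x_grid = max(grid, key=expected_payoff)

print(round(x_star, 2), round(x_grid, 2))
```

Both computations give an order quantity of about 13.86, which rounds to the value quoted in the text.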

Next we carry out a parallel analysis in terms of regret rather
than payoff. It will be seen that A(M) and F(α) are more
appealing when applied to the regret distributions. An argument will
be presented for choosing a value of x other than that which
minimizes expected regret (which, of course, is equivalent to maximizing
the expected payoff, the now classical solution to this problem).


We have, for k > 0,

$$
\{\beta : \beta \ge 0,\ r(x,\beta) < k\} =
\left\{\beta \ge 0 : x - \frac{k}{c-v} < \beta < x + \frac{k}{r-c}\right\} .
$$

Since we are dealing in terms of regret, rather than payoff, we seek
the upper envelope rather than the lower envelope. It is obtained by
maximizing, for all k > 0,

$$
\underset{x \ge 0}{\text{Maximize}}\ \operatorname{Prob}[r(x,\beta) < k]
= \int_{\max\{0,\ x - k/(c-v)\}}^{\,x + k/(r-c)} d\mu .
$$

Since the exponential density is monotone decreasing, the maximum
is easily seen to be achieved at x = k/(c-v). The height of the
upper envelope is therefore equal to Prob[β < k/(c-v) + k/(r-c)].
For the data given previously, this quantity is computed to be
[1 - exp(-0.267 k)], and the upper envelope is achieved for x = 2k.

Figure 7 is the counterpart of Figure 6. Note that the c.d.f.'s
are continuous, so that A(M) and F(α) are more effective in their
endeavor to "pin up" a c.d.f. to lie near the upper envelope.

For a given value of x, it is a straightforward matter to calculate
the expected regret and the α-fractile. This has been done
for α = .95 and some representative values of x in Figure 8. The
striking feature of this graph is that large relative changes in
.95-fractile are available with only small relative changes in expected
regret, with the result that it becomes attractive to deviate from
the ordinary minimum expected regret solution to the problem. For
example, consider x = 13.8 (which yields the minimum expected regret)
in comparison with x = 20. The former has an expected regret of
6.9 and a .95-fractile of 24.1, whereas the latter has an expected
regret of 7.7 and a .95-fractile of 14.8. That is, by choosing x = 20
instead of 13.8, one may achieve a 38.5% decrease in .95-fractile at
the expense of only an 11.6% increase in expected regret; for x = 18
instead of 13.8, the percentages become 26.1% and 5.9%.

This example shows a special instance of what is likely to be
a quite general situation: in the neighborhood of the decision indicated
by the maximum expected payoff criterion, it is possible to substantially
improve the α-fractile or aspiration levels of payoff or regret
without lowering the expected payoff very much. Such possibilities
ought to be investigated and exploited when found to be relevant to
the decision-maker's objectives.
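The trade-off between expected regret and the .95-fractile of regret can be recomputed under explicit assumptions. The sketch below assumes exponential demand with mean 10 and margins r - c = 3/2, c - v = 1/2, and it takes the fractile of regret to be the level k at which Prob[regret < k] reaches 0.95, found by bisection; that reading of the fractile for regret is an assumption. Results may differ slightly from the rounded figures quoted in the text.

```python
import math

MEAN_DEMAND = 10.0
R_MINUS_C = 1.5   # assumed margin per unit sold
C_MINUS_V = 0.5   # assumed loss per unit left over


def expected_regret(x: float) -> float:
    """E[r(x, beta)] for exponential demand."""
    shortage = MEAN_DEMAND * math.exp(-x / MEAN_DEMAND)   # E[max(beta - x, 0)]
    leftover = x - MEAN_DEMAND + shortage                 # E[max(x - beta, 0)]
    return C_MINUS_V * leftover + R_MINUS_C * shortage


def regret_cdf(x: float, k: float) -> float:
    """Prob[r(x, beta) < k] for k > 0: demand lies in an interval around x."""
    low = max(0.0, x - k / C_MINUS_V)
    high = x + k / R_MINUS_C
    return math.exp(-low / MEAN_DEMAND) - math.exp(-high / MEAN_DEMAND)


def regret_fractile(x: float, level: float, hi: float = 1000.0) -> float:
    """Smallest k (to bisection accuracy) with Prob[r(x, beta) < k] >= level."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if regret_cdf(x, mid) < level:
            lo = mid
        else:
            hi = mid
    return hi


x_star = -MEAN_DEMAND * math.log(0.25)   # minimizes expected regret
for x in (x_star, 18.0, 20.0):
    print(round(x, 1), round(expected_regret(x), 2), round(regret_fractile(x, 0.95), 2))
```

Ordering more than the expected-regret minimizer raises expected regret only slightly while pulling the upper tail of regret in substantially, which is the effect described above.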

3.4  Vector  Maximum  Reformulations 

The "ideal" decision criterion is analogous to the much-sought
philosophers' stone of medieval times, and seems about as likely
to exist. We suggest that one might profitably consider, in a given
application, two or even three plausible criteria (not necessarily
the ones discussed herein) and reformulate (4) as a vector maximum
problem. The solution of this vector maximum problem would reveal
clearly the tradeoffs involved between the criteria, and a decision
may be chosen in an ad hoc manner from the efficient candidates. For
example, if a situation such as Figure 9 occurs, one would probably
choose an efficient solution nearer to point B than to point A, for
a large gain in criterion 2 can be achieved at the expense of a
relatively small loss in criterion 1.


Figure 9 (axis label: Criterion 1, to be maximized)


One combination of criteria which seems particularly plausible
when a probability distribution over B is available is the α-fractile
criterion with the expected value criterion. With α small, the
first criterion tends to control the lower tail of the distribution
of payoffs, while the second tends to control the mean. Such a
combination might be used to program a mutual investment fund, for
example, for the possibility of ruin or large losses seems to loom as
a separate dimension of utility from the average growth rate.
Markowitz (1956) had precisely this viewpoint in mind for his
well-known portfolio problem, except that he used variance in place
of the α-fractile.

Hodges and Lehmann (1952) proposed essentially this combination
of criteria, except that they took α equal to zero. Letting α
rise above zero seems to avoid some of the excessive conservatism in
their formulation, while keeping the aim of protection against large


CHAPTER  II 


Reducing  a  Vector  Maximum  Problem  to  a 
Parametric  Programming  Problem 

In this chapter it is assumed that uncertainty has been removed
from a decision problem by means of devices such as those discussed
in the first chapter, and that it is desired to solve the vector
maximum problem,

$$\underset{x \in X}{\text{``Maximize''}} \; f(x), \tag{1}$$

where $f(x) = (f_1(x), \ldots, f_r(x))$, x is an n-vector, and X is
a given set of feasible decisions. Recall that "solving" (1) means
finding all efficient decisions, where a feasible decision $x^\circ$ is
called efficient if there exists no feasible decision $x'$ such that
$f(x') \geq f(x^\circ)$.1/ We shall discuss two ways of reducing (1) to a
parameterized family of ordinary (one criterion function) mathematical
programming problems, or "parametric" programming problems. Existing
computational methods for these problems will be indicated.
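To make the definition of efficiency concrete, here is a minimal Python sketch that filters the efficient decisions out of a finite candidate set. The finite set `X_example` and the payoff function `f_example` are hypothetical illustrations and do not come from the text; problem (1) itself allows any feasible set X.

```python
def dominates(fa, fb):
    """True if fa >= fb in every component and fa > fb in at least one."""
    return (all(a >= b for a, b in zip(fa, fb))
            and any(a > b for a, b in zip(fa, fb)))


def efficient_decisions(candidates, f):
    """Return the candidates whose payoff vector is dominated by no other."""
    payoffs = [f(x) for x in candidates]
    return [x for x, fx in zip(candidates, payoffs)
            if not any(dominates(fy, fx) for fy in payoffs)]


# Hypothetical two-criterion example on the finite set X = {0, 1, 2, 3, 4}.
def f_example(x):
    # f_1 increases with x; f_2 peaks at x = 1.
    return (x, 4 - abs(x - 1))


X_example = [0, 1, 2, 3, 4]
efficient = efficient_decisions(X_example, f_example)
print(efficient)
```

In this example the decision x = 0 is discarded because x = 1 is at least as good in both criteria and strictly better in one; the remaining decisions are mutually non-dominated.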

This  chapter  is  intended  to  serve  as  a  bridge  between  the  study 
of  decision  problems  under  uncertainty,  which  was  the  topic  of  the 
first  chapter,  and  the  study  of  a  class  of  algorithms  for  parametric 
programming,  which  is  the  topic  of  the  third  chapter. 

1/ Recall that by this notation we mean $f_i(x') \geq f_i(x^\circ)$ $(i = 1, \ldots, r)$
with $f_i(x') > f_i(x^\circ)$ for some i (see Footnote 1, Chapter I).

1.

From the definition of an efficient decision for (1), it is
easy to see that a feasible decision $x^\circ$ is efficient if and only if
$x^\circ$ is an optimal solution to each of the r problems

$$\underset{x \in X}{\text{Maximize}} \; f_i(x) \quad \text{subject to} \quad f_j(x) \geq f_j(x^\circ), \quad j = 1, \ldots, r \text{ but } j \neq i, \tag{2i}$$

$i = 1, \ldots, r$. It follows immediately that the following assertion
holds.

Proposition 1:

Let $1 \leq i_0 \leq r$. If $x^\circ$ is efficient in (1), then
there exists an (r-1)-vector $\delta$ such that $x^\circ$ is an optimal
solution of $(3i_0)$, where (3i) is given by

$$\underset{x \in X}{\text{Maximize}} \; f_i(x) \quad \text{subject to} \quad f_j(x) \geq \delta_j, \quad j = 1, \ldots, r \text{ but } j \neq i. \tag{3i}$$

This proposition suggests a method for finding all efficient
decisions. Taking r = 2 and $i_0 = 1$, for example, we find the
set of all efficient decisions among the totality of optimal solutions
to

$$\underset{x \in X}{\text{Maximize}} \; f_1(x) \quad \text{subject to} \quad f_2(x) \geq \delta \tag{3}$$

as $\delta$ varies over $(-\infty, +\infty)$. Often $f_2(x)$ is bounded from above
on X, and so the interval of parametric variation does not extend
to $+\infty$. Likewise when $f_2(x)$ is bounded from below on X, or when
the maximum of $f_1(x)$ on X is achieved for some value of x, the
interval of parametric variation need not extend to $-\infty$.

This method yields not only all efficient decisions, but possibly
some inefficient ones as well, since it may be possible to increase
$f_2(x)$ without decreasing $f_1(x)$ below its maximum value for a
particular value of $\delta$. A similar remark holds a fortiori for r > 2.
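The behavior just described can be seen on a small finite example. The sketch below applies the constraint method of problem (3) to three hypothetical decisions `a`, `b`, `c` with made-up payoffs (they are assumptions for illustration only) and then culls the dominated ones.

```python
def constraint_method(candidates, f1, f2, deltas):
    """For each delta, collect every maximizer of f1 over {x : f2(x) >= delta}."""
    found = set()
    for delta in deltas:
        feasible = [x for x in candidates if f2(x) >= delta]
        if not feasible:
            continue  # delta exceeds the upper bound of f2 on X
        best = max(f1(x) for x in feasible)
        found.update(x for x in feasible if f1(x) == best)
    return sorted(found)


# Hypothetical payoffs (f1, f2) for three decisions.
payoffs = {"a": (3, 1), "b": (3, 2), "c": (1, 3)}


def f1(x):
    return payoffs[x][0]


def f2(x):
    return payoffs[x][1]


candidates_found = constraint_method(list(payoffs), f1, f2, deltas=[1, 2, 3, 4])

# Cull decisions whose payoff is matched or beaten in both criteria by a
# different payoff vector.
efficient = [x for x in candidates_found
             if not any(payoffs[y][0] >= payoffs[x][0]
                        and payoffs[y][1] >= payoffs[x][1]
                        and payoffs[y] != payoffs[x]
                        for y in payoffs)]
print(candidates_found, efficient)
```

With $\delta = 1$ both `a` and `b` maximize $f_1$, yet `a` is inefficient because `b` offers more of $f_2$ at the same $f_1$; the culling step removes it.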

Culling out the inefficient decisions when r = 2 is easily done,
in principle, by viewing the graph of $(f_1(x), f_2(x))$ for all
candidate decisions generated by the method. For r > 2, graphical analysis
rapidly becomes impractical, and one must rely on sufficient conditions
such as those given in

Proposition 2:

Let $1 \leq i_0 \leq r$ and the (r-1)-vector $\delta^\circ$ be fixed, and let
$x^\circ$ be an optimal solution to $(3i_0)$ with $\delta = \delta^\circ$. If any of
the following three conditions is satisfied, then $x^\circ$ is
efficient in (1).

(i) $x^\circ$ is also an optimal solution of the r-1 problems
(3i), $i \neq i_0$, with $\delta_j = f_j(x^\circ)$, $j = 1, \ldots, r$.

(ii) $x^\circ$ is the unique optimal solution to $(3i_0)$ with
$\delta = \delta^\circ$.

(iii) $x^\circ$ is the unique optimal solution to $(3i_0)$ with
Proof: If (i) is satisfied, $x^\circ$ is efficient in (1) by the
opening remark of this section.

Assume that (ii) is satisfied, and suppose that $x^\circ$ is not
efficient. Then there exists $x' \in X$ such that $f(x') \geq f(x^\circ)$,
which implies that $x'$ is feasible and optimal in $(3i_0)$ with
$\delta = \delta^\circ$, thus contradicting the unique optimality of $x^\circ$. Hence
$x^\circ$ is efficient.

Since $x^\circ$ also is an optimal solution of $(3i_0)$ with $\delta_j = f_j(x^\circ)$,
the argument apropos (ii) applies.

Under additional hypotheses, Propositions 1 and 2 can be combined
to give

Proposition 3:

Let $1 \leq i_0 \leq r$ be fixed. Assume that $f_{i_0}$ is strictly concave,
$f_j$ $(j \neq i_0)$ is concave, and X is convex.2/ Then $x^\circ$ is
efficient in (1) if and only if $x^\circ$ solves $(3i_0)$ for some
(r-1)-vector $\delta$.

Proof: Necessity was proven in Proposition 1. To prove sufficiency,
apply A.2 of Appendix A and part (ii) of Proposition 2.

2/ See Appendix A for definitions of convex sets and concave functions,
and some properties thereof which will be used freely in the sequel.

2. Reducing (1) to a Problem Parametric in the Objective Function

We shall give some conditions under which (1) can be reduced to
a family of problems of the form

$$\underset{x \in X}{\text{Maximize}} \; \sum_{i=1}^{r} v_i f_i(x), \tag{4}$$

where $v \geq 0$ is a vector-valued parameter.

Proposition 4:

(i) If $v > 0$ (every component positive) and $x^\circ$ is an optimal
solution to (4), then $x^\circ$ is efficient in (1).

(ii) If $v \geq 0$ and $x^\circ$ is the unique optimal solution of
(4), then $x^\circ$ is efficient in (1).

Proof: Suppose that (i) is false. Then there exists $x' \in X$
such that $f(x') \geq f(x^\circ)$; since $v > 0$, this implies that
$\sum_i v_i f_i(x') > \sum_i v_i f_i(x^\circ)$, thus contradicting the optimality of $x^\circ$
in (4). This proves (i).

Suppose that (ii) is false. Then there exists $x' \in X$, $x' \neq x^\circ$,
such that $f(x') \geq f(x^\circ)$; since $v \geq 0$, this implies that
$\sum_i v_i f_i(x') \geq \sum_i v_i f_i(x^\circ)$, thus contradicting the unique optimality
of $x^\circ$ in (4). This proves (ii).

Proposition 5:3/

Let X be convex, let $f_i(x)$ be concave, $i = 1, \ldots, r$, and let
$x^\circ$ be efficient in (1). Then there exists an r-vector $v^\circ \geq 0$
such that $x^\circ$ is an optimal solution of (4) with $v = v^\circ$.

3/ The earliest statement and proof of a theorem of this type seems to
be due to Kuhn and Tucker (1951). An elegant proof of this proposition
has been given by Karlin (1959, p. 217). For the sake of completeness
we record a slightly different version of that proof here.
Proof: Put $P = \{p \in E^r : p \geq f(x^\circ)\}$ (componentwise). Clearly P is convex.
Put $Z = \{z \in E^r : z \leq f(x) \text{ for some } x \in X\}$. Z is convex, for
let $z', z'' \in Z$ and let $0 \leq \lambda \leq 1$. By the definition of Z there
exist $x', x'' \in X$ such that $z' \leq f(x')$ and $z'' \leq f(x'')$. Hence

$$\lambda z' + (1-\lambda) z'' \leq \lambda f(x') + (1-\lambda) f(x'') \leq f(\lambda x' + (1-\lambda) x''),$$

where the last inequality follows from the concavity of f(x). Since
$\lambda x' + (1-\lambda) x'' \in X$ by the convexity of X, $\lambda z' + (1-\lambda) z'' \in Z$.
This shows that Z is convex.

Because $x^\circ$ is efficient, $Z \cap P$ is the single point $f(x^\circ)$,
so that Z and P have no interior points in common. Hence we may
apply the well-known Theorem of the Separating Hyperplane (see A.?,
Appendix A) to assert the existence of an r-vector $v^\circ \neq 0$ and a
scalar c such that

$$\sum_i v_i^\circ z_i \leq c \leq \sum_i v_i^\circ p_i, \quad \forall\, z \in Z, \; p \in P.$$

The right-hand inequality and the definition of P imply that
$v^\circ \geq 0$, for otherwise the sum $\sum_i v_i^\circ p_i$ would be unbounded from
below. By the definition of Z, the left-hand inequality yields
$\sum_i v_i^\circ f_i(x) \leq c$, $\forall\, x \in X$. Taking $p = f(x^\circ)$, we have
$\sum_i v_i^\circ f_i(x) \leq \sum_i v_i^\circ f_i(x^\circ)$, $\forall\, x \in X$, which is equivalent to the
assertion that $x^\circ$ is an optimal solution of (4) with $v = v^\circ$.


When the hypotheses of Proposition 5 hold, one is sure to find
all efficient decisions for (1) among the totality of optimal decisions
for (4) as v ranges over all non-negative values. Notice that
without loss of generality one may take $\sum_i v_i = 1$ in (4), since for
fixed $v \geq 0$, $v \neq 0$, the objective function of that problem can be scaled
by a factor of $1/\sum_i v_i$ without affecting the set of optimal solutions.
Hence v is really only an (r-1)-dimensional parameter. When r = 2,
for example, (4) reduces to the parametric problem

$$\underset{x \in X}{\text{Maximize}} \; v f_1(x) + (1 - v) f_2(x) \quad \text{for each } 0 \leq v \leq 1. \tag{4.1}$$
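As an illustration of the weighted form (4.1), the sketch below sweeps the weight v over a grid and records every maximizer. The three decisions and their payoffs are hypothetical, and the feasible set is deliberately finite (hence not convex) to show why the convexity hypothesis of Proposition 5 matters. Exact rational weights from `fractions.Fraction` are used so that ties are detected reliably.

```python
from fractions import Fraction


def weighted_sum_method(candidates, f1, f2, weights):
    """For each v, collect all maximizers of v*f1(x) + (1 - v)*f2(x)."""
    found = set()
    for v in weights:
        scores = {x: v * f1(x) + (1 - v) * f2(x) for x in candidates}
        best = max(scores.values())
        found.update(x for x, s in scores.items() if s == best)
    return sorted(found)


# Hypothetical payoffs (f1, f2); all three decisions are efficient.
payoffs = {"a": (0, 4), "b": (1, 1), "c": (4, 0)}

# Weights strictly between 0 and 1, so both weights are positive.
weights = [Fraction(k, 10) for k in range(1, 10)]

found = weighted_sum_method(
    list(payoffs),
    lambda x: payoffs[x][0],
    lambda x: payoffs[x][1],
    weights,
)
print(found)
```

Every decision found with positive weights is efficient, in line with part (i) of Proposition 4. Decision `b` is efficient but is never found, because its weighted payoff is 1 for every v while `a` or `c` always reaches at least 2; on a convex problem satisfying Proposition 5 no efficient decision would be missed in this way.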

By strengthening the hypotheses of Proposition 5, the last two
propositions can be combined to give

Proposition 6:

Let X be convex, and let $f_i(x)$ $(i = 1, \ldots, r)$ be strictly
concave. Then $x^\circ$ is efficient in (1) if and only if $x^\circ$
solves (4) for some $v \geq 0$.

Proof: Necessity was proven in Proposition 5. To prove sufficiency,
apply A.2, A.4, and part (ii) of Proposition 4.

3. Computational Methods for Parametric Problems

A very common approach for a decision-maker to take, when faced
with solving a multi-criterion problem such as (1), is to reformulate
(1) in the form of (3i) or (4) (or possibly a combination of the two)
with $\delta$ or v fixed at some value of particular interest. Problem
(3i) corresponds to selecting and retaining the most important criterion
function and putting the rest in as constraints so that the remaining
criteria each meet at least some minimally acceptable level.4/
Problem (4) corresponds to maximizing a weighted combination of criteria
which is designed to reflect the relative importance of each. Such
an approach offers computational simplicity in comparison with a
complete solution of (1), since just one ordinary maximization problem
has to be solved. After (3i) or (4) has been solved for the selected
$\delta^\circ$ or $v^\circ$, the value of $\delta$ or v may be varied in a neighborhood
of $\delta^\circ$ or $v^\circ$ in order to ascertain how the corresponding optimal
decisions and payoff function vary. This is a type of "sensitivity
analysis." The above propositions relate this type of sensitivity
analysis to the partial solution of (1) in the vector maximum sense.

4/ For an important example of this, see Neyman and Pearson
(1933), who employed this device as a cornerstone of their theory of
statistical hypothesis testing.

Whether for purposes of sensitivity analysis or of solving (1),
solution methods are required for the parametric problems associated
with (3i) and (4). Since analytic methods can be expected to have
very limited applicability (if experience with non-parametric
mathematical programming is any guide), numerical methods must be employed.
In this regard, we are obliged to limit our consideration to problems
for which X is convex and $f_i(x)$ $(i = 1, \ldots, r)$ is concave, for
most known programming algorithms5/ require at least convexity of
the feasible region and concavity of the objective function. We shall
further limit our consideration to the important case r = 2, because
the vastness of the parameter space increases so rapidly with r as
to preclude the reasonable hope of solving parametric problems even
to reasonable approximation when r is much larger than 2 or 3.
We now indicate some existing computational methods, and point out
the need for the developments of the next chapter.

5/ For surveys of (nonlinear) programming algorithms see, e.g.

If X is a convex polyhedron (i.e., the feasible region is
determined by a set of linear equalities or inequalities), then several
efficient parametric programming algorithms are available for certain
special classes of criterion functions: when $f_1$ and $f_2$ are both
linear functions, parametric versions of (3) and (4.1) can be solved
by parametric linear programming (Gass, 1955); when $f_1$ is linear
and $f_2$ is a quadratic polynomial,6/ the algorithms of Houthakker
(1960), Markowitz (1956), and Wolfe (1959) are available;7/ when $f_1$
and $f_2$ are both quadratic polynomials, an algorithm of Zahl (1964)
essentially solves (4.1), although it seems possible to improve upon
the efficiency of his procedure by utilizing the developments of the
next chapter. Little if anything appears to have been done to devise
efficient algorithms for parametric problems involving more general
classes of criterion functions or feasible regions other than convex
polyhedra. The class of algorithms developed in Chapter III is
intended as a contribution in this direction. At the present state
of the art of parametric programming, however, one must fall back
upon more rudimentary methods.

6/ That is, $f_2(x) = x^t Q x$, where t denotes transpose and Q
is a negative semidefinite matrix.

7/ See also Boot (1963a, 1963b).

In principle, if an algorithm is available which will solve
(3i) or (4) for any particular value of the parameter, then by
employing a suitably fine grid of parameter values one can obtain a
discrete approximation to the optimal solutions of the parametric
problem. This is a very straightforward approach, and for many
problems it may be fairly practical, since the optimal solution for
one parameter value can be expected to provide a nearly optimal
solution at the next parameter value on the grid. Because most
programming algorithms may be viewed as gradient methods, this
approach should provide roughly first order convergence between
optimal solutions at adjacent pairs of grid points.
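A minimal sketch of this grid approach, under stated assumptions: the criterion functions $f_1(x) = -(x-1)^2$ and $f_2(x) = -(x+1)^2$ on X = [-1, 1] are hypothetical test functions chosen so that the exact maximizer of (4.1) is 2v - 1, and plain projected gradient ascent stands in for "an algorithm which will solve (4) for any particular value of the parameter." Each grid point is started from the solution at the previous grid point.

```python
def solve_fixed_v(v, x_start, step=0.25, tol=1e-10, max_iter=10_000):
    """Projected gradient ascent for max over [-1, 1] of v*f1(x) + (1-v)*f2(x),
    with the hypothetical criteria f1(x) = -(x-1)**2 and f2(x) = -(x+1)**2."""
    x = x_start
    for iterations in range(1, max_iter + 1):
        grad = -2.0 * v * (x - 1.0) - 2.0 * (1.0 - v) * (x + 1.0)
        x_new = min(1.0, max(-1.0, x + step * grad))  # project onto [-1, 1]
        if abs(x_new - x) < tol:
            return x_new, iterations
        x = x_new
    return x, max_iter


def solve_on_grid(grid, x0=0.0):
    """Solve at each grid value of v, warm-starting from the previous solution."""
    path, x = [], x0
    for v in grid:
        x, iterations = solve_fixed_v(v, x)
        path.append((v, x, iterations))
    return path


grid = [k / 10 for k in range(11)]
path = solve_on_grid(grid)
for v, x, iterations in path:
    print(f"v = {v:.1f}  x = {x:+.6f}  iterations = {iterations}")
```

With this step size the error is halved at every iteration, which is the first order behavior referred to above: the iteration count depends only logarithmically on how good the starting point is.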

In the next chapter we offer an alternative to the last approach
under quite general assumptions on the criterion functions and the
feasible region. We shall develop a class of algorithms for solving
(4.1), a main member of which exhibits second order8/ convergence
between adjacent pairs of grid points.

8/ A sequence $\langle x^n \rangle$ which converges to $x^\circ$ exhibits first (second)
order convergence if the norm of the error at the n-th step is
asymptotically proportional to the (square of the) norm of the error
at the n-1st step (see Appendix C, section 1).
CHAPTER III


A  Class  of  Algorithms  for  Parametric  Concave  Programming 
1.  Introduction  and  Preliminaries 

In this chapter we present a class of algorithms for solving parametric
concave programming problems of the form

$$\underset{x}{\text{Maximize}} \; \alpha f_1(x) + (1 - \alpha) f_2(x) \quad \text{subject to} \quad g(x) \geq 0 \tag{P$\alpha$}$$

for each $\alpha \in [0,1]$, where x is an n-vector, $f_i(x)$ $(i = 1, 2)$ is
strictly concave,1/ and each component function of $g(x) = (g_1(x), \ldots, g_m(x))$
is concave. Certain additional regularity requirements are detailed in
subsection 2.1.

Since our topic is parametric programming, rather than ordinary
(non-parametric) mathematical programming, we shall further assume
that an optimal solution of (P$\alpha$) is available for some value of $\alpha$
in the unit interval. This assumption is in fact not restrictive,
for it is shown in subsection 1.1 that a parametric programming algorithm
for (P$\alpha$) which requires an optimal solution for some value of $\alpha$
in order to "get started" can itself be used to generate such an
optimal solution.

1/ The algorithms to be given still apply if (in the following, $\epsilon > 0$
is arbitrarily small): (a) $f_1$ is strictly concave and $f_2$ is
(non-strictly) concave and [0,1] is replaced by $[\epsilon, 1]$, or (b) $f_1$ is
concave and $f_2$ is strictly concave and [0,1] is replaced by
$[0, 1-\epsilon]$, or (c) $\alpha f_1 + (1-\alpha) f_2$ is strictly concave for each fixed
$\alpha \in (0,1)$ and [0,1] is replaced by $[\epsilon, 1-\epsilon]$.
The remainder of this section motivates (P$\alpha$) and the present
class of algorithms: in subsection 1.1 it is noted that (P$\alpha$) subsumes
the vector maximum problem for two criterion functions and also the
standard (non-parametric) concave programming problem, and in
subsection 1.2 the Kuhn-Tucker Theorem for nonlinear programming is
presented in slightly unconventional form so as to display clearly
the foundation upon which the present class of algorithms is built.

Section 2 is devoted to presenting and proving a Basic Conceptual
Algorithm for solving (P$\alpha$) for each value of $\alpha$ in the unit interval.
Three graphical examples are given in Appendix B. The development
of this conceptual algorithm into a Basic Computational Algorithm, via the
use of Newton's method for solving the relevant systems of equations,
is the subject of section 3. Some necessary computational devices are
recorded in Appendix C. Section 4 hosts a modification (more accurately,
a completion) of the algorithms aimed at improving their efficiency.
Two extensions are indicated in section 5: the adaptation of the
present algorithms to handle linear equality constraints, and the
possibility of solving more general kinds of parametric problems than
(P$\alpha$).


1.1 Motivation of (P$\alpha$)

One motive for studying (P$\alpha$) was given in Chapter II. From
Proposition 6 of that chapter, which applies because of the above
assumptions, solving (P$\alpha$) for all $0 \leq \alpha \leq 1$ is exactly equivalent
to solving the vector maximum problem

$$\text{``Maximize''} \; f_1(x), f_2(x) \quad \text{subject to} \quad g(x) \geq 0. \tag{1}$$

That is, every efficient decision for (1) is an optimal solution of
(P$\alpha$) for some $0 \leq \alpha \leq 1$, and conversely.

Another reason for studying (P$\alpha$) is that it subsumes the standard
problem of concave programming. Suppose that it is desired to solve

$$\underset{x}{\text{Maximize}} \; F(x) \quad \text{subject to} \quad g(x) \geq 0, \tag{2}$$

where F(x) is strictly concave and the constraint functions are all
concave. If $x^\circ$ is any feasible decision whatsoever of (2), put
(P$\alpha$) equal to

$$\underset{x}{\text{Maximize}} \; \alpha F(x) + (1 - \alpha)(-1) \sum_{i=1}^{n} (x_i - x_i^\circ)^2 \quad \text{subject to} \quad g(x) \geq 0. \tag{3$\alpha$}$$

Then $x^\circ$ clearly is the optimal solution of $(3_0)$, and (3$\alpha$) satisfies
the assumptions required of (P$\alpha$) in the opening paragraph. Applying
an algorithm for parametric concave programming to (3$\alpha$) beginning
with $\alpha = 0$ and increasing $\alpha$ until $\alpha = 1$, one obtains the optimal
solution to $(3_1)$, which is identical to (2). Hence a parametric
algorithm for (P$\alpha$) provides a "deformation" method of concave
programming.
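The deformation idea can be sketched in one dimension. The code below is an illustration under simplifying assumptions, not the algorithm developed in this chapter: the criterion $F(x) = -(x-3)^2 - 0.1x^4$ is a hypothetical strictly concave function, the constraints are assumed inactive along the whole path and are therefore omitted, and each intermediate problem is solved by Newton's method applied to the stationarity equation, warm-started from the previous solution.

```python
def F_prime(x):
    """First derivative of the hypothetical criterion F(x) = -(x-3)**2 - 0.1*x**4."""
    return -2.0 * (x - 3.0) - 0.4 * x ** 3


def F_second(x):
    """Second derivative of F; negative everywhere, so F is strictly concave."""
    return -2.0 - 1.2 * x ** 2


def deformation_path(x0, steps=20, newton_iters=20):
    """Trace the maximizer of a*F(x) - (1 - a)*(x - x0)**2 as a goes from 0 to 1."""
    x, path = x0, [(0.0, x0)]  # at a = 0 the maximizer is x0 itself
    for k in range(1, steps + 1):
        a = k / steps
        for _ in range(newton_iters):
            d1 = a * F_prime(x) - 2.0 * (1.0 - a) * (x - x0)   # gradient
            d2 = a * F_second(x) - 2.0 * (1.0 - a)             # curvature (< 0)
            x -= d1 / d2                                        # Newton step
        path.append((a, x))
    return path


path = deformation_path(x0=0.0)
for a, x in path[::5]:
    print(f"alpha = {a:.2f}  x = {x:.6f}")
```

The last point on the path satisfies F'(x) = 0 and so maximizes F, while the intermediate points give a gradual route from the starting decision to the optimum.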

Problem (3$\alpha$) is capable of an interesting interpretation, which
we shall now sketch briefly. Consider an enterprise currently "operating"
at the (feasible) point $x^\circ$, with a single criterion function F(x)
and a feasible operating region $\{x : g(x) \geq 0\}$. Due to conservatism,
or a desire to avoid disrupting the operations of the enterprise
radically, or to a desire to hedge against the risk of a faulty decision
model, assume that the managers of the enterprise prefer to adjust the
operating point gradually from $x^\circ$ toward $x^*$, where $x^*$ is optimal
in (2). If the managers have a quadratic loss function $\sum_i (x_i - x_i^\circ)^2$
associated with deviations from $x^\circ$, the optimal solution to (3$\alpha$)
as $\alpha$ varies from 0 to 1 gives an optimum path from $x^\circ$ to $x^*$.

Since (P$\alpha$) for fixed $\alpha$ is of the form (2), the device
represented by (3$\alpha$) can be used to find a starting optimal solution to
(P$\alpha$) if one exists (providing that a feasible decision is known), so
that the assumption stated in the introduction is not restrictive,
as asserted.

Of course, in place of (3$\alpha$) one could use

$$\underset{x}{\text{Maximize}} \; \alpha F(x) + (1 - \alpha) H(x) \quad \text{subject to} \quad g(x) \geq 0, \tag{4$\alpha$}$$

where H(x) is a strictly concave function with a known maximum
over the feasible region.

1.2 Theoretical Foundation

The standard problem of concave programming can be written in the
form of (P$\alpha_0$) with $\alpha_0$ fixed. For simplicity of notation, we write
$f(x;\alpha)$ for $\alpha f_1(x) + (1-\alpha) f_2(x)$. Hence (P$\alpha_0$) may be written as

$$\underset{x}{\text{Maximize}} \; f(x;\alpha_0) \quad \text{subject to} \quad g(x) \geq 0. \tag{P$\alpha_0$}$$
Fundamental theoretical results concerning this problem have been given
by Kuhn and Tucker (1951). A version of their Theorem 3 is recorded
here without proof.

Theorem (Kuhn-Tucker):

Consider (P$\alpha_0$) with $\alpha_0$ fixed. Let $f(x;\alpha_0)$ and $g_i(x)$
$(i = 1, \ldots, m)$ be differentiable on the feasible region $\{x : g(x) \geq 0\}$,
let $f(x;\alpha_0)$ be concave on the feasible region, and let $g_i(x)$
$(i = 1, \ldots, m)$ be concave on $E^n$. Assume that the constraint functions
satisfy the Kuhn-Tucker Constraint Qualification (see the remark
following the statement of the theorem).

Then $x^\circ$ is an optimal solution of (P$\alpha_0$) only if there
exist m real numbers $\lambda_i^\circ$ such that $(x^\circ, \lambda^\circ)$ satisfies the following
(Kuhn-Tucker) conditions1/ at $\alpha = \alpha_0$:

$$\nabla_x f(x;\alpha) + \sum_{i=1}^{m} \lambda_i \nabla_x g_i(x) = 0 \tag{5}$$

$$g_i(x) \geq 0, \quad \lambda_i \geq 0, \quad i = 1, \ldots, m \tag{6}$$

$$\lambda_i > 0 \text{ implies } g_i(x) = 0, \quad i = 1, \ldots, m. \tag{7}$$

Remark: For a statement and discussion of the Kuhn-Tucker Constraint
Qualification, see Kuhn and Tucker (1951, p. 485) or Arrow,
Hurwicz, and Uzawa (1961). It has been shown, for example,
that if all the constraints are linear then this qualification
is satisfied, and that the existence of an interior point
of the feasible region is also sufficient for the
qualification to be satisfied. The sufficient condition which will
be of direct use in the sequel is: if $x^*(\alpha_0)$ is an optimal
solution of (P$\alpha_0$), then the matrix whose rows are
$\nabla_x g_i(x^*(\alpha_0))$, i such that $g_i(x^*(\alpha_0)) = 0$, is of maximal
rank (see Arrow, Hurwicz, and Uzawa, 1961).

1/ The symbol $\nabla$ denotes the gradient of a function of
several variables, e.g., $\nabla_x f(x) = (\partial f / \partial x_1, \ldots, \partial f / \partial x_n)$.


Direct analytical or numerical attempts to satisfy these conditions
have proven quite difficult, in general.

We shall find the following equivalent version of the Kuhn-Tucker
Theorem more suitable for our purposes.


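Theorem (Kuhn-Tucker, an alternate version):

Assume that the hypotheses of the Kuhn-Tucker Theorem are satisfied.
Then $x^\circ$ is an optimal solution of (P$\alpha_0$) if and only if there
exist m real numbers $u_i^\circ$ and a subset $S^\circ$ of constraint indices
such that $(x^\circ, u^\circ, S^\circ)$ satisfies the following conditions at $\alpha = \alpha_0$:

$$(=S)\alpha: \quad \begin{cases} \nabla_x f(x;\alpha) + \sum_{i \in S} u_i \nabla_x g_i(x) = 0 & \text{(KT-1)} \\ g_i(x) = 0, \; \forall\, i \in S; \qquad u_i = 0, \; \forall\, i \notin S & \text{(KT-2)} \end{cases}$$

$$g_i(x) \geq 0, \quad \forall\, i \notin S \tag{KT-3}$$

$$u_i \geq 0, \quad \forall\, i \in S. \tag{KT-4}$$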
Equations (KT-1) and (KT-2) appear so often together in the sequel
that we introduce the special symbol (=S)$\alpha$ to denote them (in this
notation, S and $\alpha$ may vary). We also denote the set of the first m
positive integers by M.
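Conditions (KT-1) through (KT-4) are easy to check mechanically for a candidate triple (x, u, S). The sketch below does so for a hypothetical two-variable problem with $\alpha$ already fixed: maximize $-(x_1 - 2)^2 - (x_2 - 1)^2$ subject to $g_0(x) = 1 - x_1 \geq 0$ and $g_1(x) = x_2 \geq 0$. The problem data, the zero-based constraint indices, and the function name `satisfies_kt` are all illustrative assumptions.

```python
def satisfies_kt(x, u, S, grad_f, g, grad_g, tol=1e-9):
    """Check (KT-1) through (KT-4) for a candidate (x, u, S).

    S is a set of constraint indices; g and grad_g are lists of callables.
    """
    m, n = len(g), len(x)
    # (KT-1): grad f(x) + sum over i in S of u_i * grad g_i(x) = 0
    gf = grad_f(x)
    residual = [gf[k] + sum(u[i] * grad_g[i](x)[k] for i in S) for k in range(n)]
    kt1 = all(abs(r) <= tol for r in residual)
    # (KT-2): g_i(x) = 0 for i in S, and u_i = 0 for i not in S
    kt2 = (all(abs(g[i](x)) <= tol for i in S)
           and all(abs(u[i]) <= tol for i in range(m) if i not in S))
    # (KT-3): g_i(x) >= 0 for i not in S
    kt3 = all(g[i](x) >= -tol for i in range(m) if i not in S)
    # (KT-4): u_i >= 0 for i in S
    kt4 = all(u[i] >= -tol for i in S)
    return kt1 and kt2 and kt3 and kt4


# Hypothetical problem data.
def grad_f(x):
    return [-2.0 * (x[0] - 2.0), -2.0 * (x[1] - 1.0)]


g = [lambda x: 1.0 - x[0], lambda x: x[1]]
grad_g = [lambda x: [-1.0, 0.0], lambda x: [0.0, 1.0]]

# S = {0} with x = (1, 1), u = (2, 0) satisfies all four conditions.
valid = satisfies_kt([1.0, 1.0], [2.0, 0.0], {0}, grad_f, g, grad_g)
# S = {} leads to the unconstrained maximizer (2, 1), which violates (KT-3).
invalid = satisfies_kt([2.0, 1.0], [0.0, 0.0], set(), grad_f, g, grad_g)
print(valid, invalid)
```

In this example the set {0} is the one for which a solution of (KT-1) and (KT-2) also satisfies the inequalities, whereas the empty set fails because the unconstrained maximizer is infeasible.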

The equivalence of the two versions of this theorem follows from
the easily verified

Proposition 1:

(i) If $(x^\circ, \lambda^\circ)$ satisfies (5) through (7) at $\alpha_0$, then
$(x^\circ, \lambda^\circ, S^\circ)$ satisfies (KT-1) through (KT-4) at $\alpha_0$ for
any $S^\circ$ satisfying

$$\{i \in M : \lambda_i^\circ > 0\} \subseteq S^\circ \subseteq \{i \in M : g_i(x^\circ) = 0\}. \tag{8}$$

(ii) If $(x^\circ, u^\circ, S^\circ)$ satisfies (KT-1) through (KT-4) at $\alpha_0$,
then $(x^\circ, u^\circ)$ satisfies (5) through (7) at $\alpha_0$.

The numbers $\lambda_i$ or $u_i$ will be referred to as dual variables.
In view of Proposition 1 it is useless to distinguish between $\lambda$ and
u; henceforth we shall use the symbol u to refer to the dual
variables of either version of the Kuhn-Tucker Theorem.

The concept of a valid set plays a central role in this work.
A subset $S^\circ$ of constraint indices is said to be valid at $\alpha_0$ if
and only if there exists $(x^\circ, u^\circ)$ such that $(x^\circ, u^\circ, S^\circ)$ satisfies
(KT-1) through (KT-4) at $\alpha_0$.
Proposition 2:

A subset $S^\circ$ of constraint indices is valid at $\alpha_0$ if and only
if $S^\circ$ satisfies (8) for some $(x^\circ, \lambda^\circ)$ which satisfies (5)
through (7) at $\alpha_0$.

Proof: Assume that $S^\circ$ is valid at $\alpha_0$. Then there exists $(x^\circ, u^\circ)$
such that $(x^\circ, u^\circ, S^\circ)$ satisfies (KT-1) through (KT-4) at $\alpha_0$, which
implies by part (ii) of Proposition 1 that $(x^\circ, u^\circ)$ satisfies (5)
through (7) at $\alpha_0$. By (KT-2) and (KT-4), $\{i \in M : u_i^\circ > 0\} \subseteq S^\circ$ holds.
By (KT-2), $S^\circ \subseteq \{i \in M : g_i(x^\circ) = 0\}$ holds. This proves necessity.
Assume now that $S^\circ$ satisfies (8) for some $(x^\circ, \lambda^\circ)$ satisfying (5)
through (7) at $\alpha_0$. By part (i) of Proposition 1, $(x^\circ, \lambda^\circ, S^\circ)$ satisfies
(KT-1) through (KT-4) at $\alpha_0$, which shows that $S^\circ$ is valid at $\alpha_0$.

The alternate version encourages the important observation that the
Kuhn-Tucker Conditions may be viewed as the Lagrange multiplier equations2/
applied to a subset S of the constraints, augmented by the inequations
(KT-3) and (KT-4). Attention thereby focuses on discovering the
identity of a valid set, for if one knew a valid set S* then in
principle one could solve (=S*)$\alpha_0$ for all solutions (x', u'),
among which at least one would satisfy (KT-3) and (KT-4) and hence
solve (P$\alpha_0$). Indeed, at least one algorithm (see Theil and Van de
Panne, 1960, and also Boot, 1961) has already been proposed which is
essentially aimed at determining a valid set. However, this approach
is probably not very efficient computationally, for although it reduces
the concave programming problem to one of solving sets of simultaneous
equations, there is a vast number of candidate sets of equations to
be tried when a valid set is not known. It seems to be difficult, even
for problems of modest size, to know how to order the trials so as to
keep the number of erroneous trials at a reasonable level. This
combinatorial difficulty is further aggravated by the numerical burden
of actually solving (=S)$\alpha_0$. Thus we may expect the customary gradient
methods to be more efficient than methods based on the "valid set
approach."

2/ The method of Lagrange multipliers (see, e.g., Apostol, 1957, p. 155)
gives a set of first order necessary conditions for a point $x^\circ$ to be
an optimal solution of the problem

$$\underset{x}{\text{Maximize}} \; f(x) \quad \text{subject to} \quad g_i(x) = 0, \quad i = 1, \ldots, m.$$

Assume that f(x) and $g_i(x)$ $(i = 1, \ldots, m)$ are continuously
differentiable on some open region containing the feasible region, and that the
matrix whose rows are $\nabla_x g_i(x^\circ)$, $i = 1, \ldots, m$, is of maximal rank (note
that this last assumption implies that $m \leq n$, where n is the dimension
of x). If $x^\circ$ is an optimal solution of the above problem, then
there exist m real numbers $\lambda_i^\circ$ such that $(x^\circ, \lambda^\circ)$ satisfies the
(Lagrange multiplier) equations:

$$\nabla_x f(x) + \sum_i \lambda_i \nabla_x g_i(x) = 0 \quad \text{and} \quad g_i(x) = 0, \quad i = 1, \ldots, m.$$

Let us turn now to parametric programming. It is perhaps surprising,
in view of the immediately preceding comments, that here methods based
on the valid set approach seem to have the advantage over gradient
methods. In fact the parametric programming algorithms (cf. section 3
of Chapter II) of Markowitz (1956), Houthakker (1960), and Zahl (1964)
each may be viewed as maintaining the identity of a valid set as a
parameter is varied.
Under appropriate assumptions the optimal solution $x^*(\alpha)$ of
(P$\alpha$) and the associated dual variables $u^*(\alpha)$ are unique and
continuous. This fact, coupled with the observation that there is only
a finite number of subsets of constraints, suggests that if S' is
valid at $\alpha_0$, say, then S' is likely to be valid in some interval
including $\alpha_0$. If this is the case, then one may derive $x^*(\alpha)$ and
$u^*(\alpha)$ in that interval by solving (=S')$\alpha$ parametrically, and (KT-3)
and (KT-4) are automatically satisfied. If this is not the case,
then even though (=S')$\alpha$ may have a solution near $\alpha_0$, either (KT-3)
or (KT-4) will be violated, and it is necessary to find a new valid
set before being able to proceed. Because of continuity, moreover,
a set which is valid near $\alpha_0$ will usually differ by only a few
constraint indices from S'. This approach leads to a decomposition of
(P$\alpha$) on [0,1] into a chain of parametric subproblems. Each
subproblem involves the parametric solution of the Lagrange multiplier
equations associated with the constraints specified by a constant valid
set on a subinterval of [0,1]. By continuity the optimal terminal
solution to one subproblem is the optimal initial solution to the
next subproblem of the chain, and the valid sets of adjacent
subproblems are both valid at the transition point between them.
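The decomposition into subproblems can be illustrated on a toy problem small enough to solve by hand. The sketch below is only an illustration of the idea of carrying a valid set along the parameter grid; it is not the Basic Conceptual Algorithm of section 2. It assumes the hypothetical data $f_1(x) = -(x-2)^2$, $f_2(x) = -x^2$ and a single constraint $g(x) = 1 - x \geq 0$ (index 0), for which the equations (KT-1) and (KT-2) have the closed-form solutions written in `solve_equalities`.

```python
def solve_equalities(alpha, S):
    """Solve (KT-1) and (KT-2) for the toy problem; returns (x, u)."""
    if S:
        # Constraint g(x) = 1 - x treated as an equality: x = 1, and
        # d/dx f(x; alpha) + u * g'(x) = (4*alpha - 2) - u = 0 gives u.
        return 1.0, 4.0 * alpha - 2.0
    # No constraint imposed: stationary point of f(x; alpha), with u = 0.
    return 2.0 * alpha, 0.0


def follow_path(grid, tol=1e-12):
    """Carry a valid set along the grid, switching when (KT-3) or (KT-4) fails."""
    S = frozenset()  # valid at alpha = 0, where x = 0 is optimal
    path = []
    for alpha in grid:
        x, u = solve_equalities(alpha, S)
        if not S and 1.0 - x < -tol:
            # (KT-3) violated: the constraint must enter the valid set.
            S = frozenset({0})
            x, u = solve_equalities(alpha, S)
        elif S and u < -tol:
            # (KT-4) violated: the constraint must leave the valid set
            # (not triggered on an increasing sweep of this example).
            S = frozenset()
            x, u = solve_equalities(alpha, S)
        path.append((alpha, x, u, S))
    return path


grid = [k / 8 for k in range(9)]
path = follow_path(grid)
for alpha, x, u, S in path:
    print(f"alpha = {alpha:.3f}  x = {x:.3f}  u = {u:.3f}  S = {sorted(S)}")
```

Here the empty set is valid for $\alpha \leq 1/2$ and the set {0} is valid for $\alpha \geq 1/2$, so both are valid at the transition point $\alpha = 1/2$, where the two subproblems share the solution x = 1, u = 0.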

Thus parametric programming can be reduced essentially to the
problem in numerical analysis of solving parameterized (nonlinear,
in general) simultaneous equations. This approach to parametric
programming turns out to be a useful one computationally, since the
systems of equations involved will be shown to be well-behaved. By
applying Newton's method (see Appendix C), second order convergence
can be achieved as the parameter increases by discrete increments,
whereas gradient methods display roughly first order convergence.

2. A Basic Conceptual Algorithm

In this section we state and prove a Basic Conceptual Algorithm
for solving $(P\alpha)$ for each value of $\alpha$ in the unit interval. We
use the adjective "conceptual" because computational implementation
is not considered at this point of the exposition. The Basic Conceptual
Algorithm can be modified and implemented in various ways,
as will be indicated in sections 3 and 4, thus giving rise to an entire
class of computational algorithms.

2.1 Assumptions

We assume that an optimal solution of $(P\alpha)$ is available for some
value of $\alpha$ in the unit interval, say $\alpha = 0$ (in view of the
discussion of subsection 1.1, this assumption is not restrictive).

Throughout this work the following conditions will be imposed
upon $(P\alpha)$. We denote the feasible region $\{x : g(x) \ge 0\}$ by X.

Condition 1: The functions $f_i(x)$ ($i = 1,2$) and $g_i(x)$
($i = 1,\ldots,m$) are analytic on some open region
containing X, and the constraint functions are
concave on $E^n$.

Condition 2: X is non-empty and bounded.

Condition 3: The Hessian matrices $\nabla_x^2 f_i(x)$ ($i = 1,2$) are
negative definite for all $x \in X$.

Condition 4: If $\alpha_0 \in [0,1]$ and $x^*(\alpha_0)$ is an optimal solution
of $(P\alpha_0)$, then the matrix whose rows are the
gradients $\nabla_x g_i(x^*(\alpha_0))$, $i$ such that
$g_i(x^*(\alpha_0)) = 0$, is of maximal rank.

A function $f(x_1,\ldots,x_n)$ of n real variables is said to be
analytic in a region R if in some neighborhood of every point of R
the function is the sum of a convergent power series with real
coefficients. The class of all analytic functions includes, for example,
all polynomials, and seems amply wide enough to include nearly any
continuous function likely to be encountered in applications.

Conditions 1 and 2 imply, by A.1 of Appendix A, that X is
convex and compact.

Condition 3 implies, by A.5, that $f_1$ and $f_2$ are strictly
concave on X. This, in turn, implies by A.4 that $f(x;\alpha) = \alpha f_1(x) +
(1-\alpha) f_2(x)$ is strictly concave on X for each fixed value of
$\alpha \in [0,1]$. In the presence of Conditions 1 and 2, this last assertion
remains true even on some open interval containing [0,1], as
Proposition 3 shows.

Proposition 3:

Assume that Conditions 1, 2, and 3 hold. Then $\nabla_x^2 f(x;\alpha)$ is
negative definite on X for each fixed value of $\alpha$ in some
open interval containing [0,1].

(Footnote: $\nabla_x^2 f(x)$ denotes the n by n matrix whose ij-th element
is $\partial^2 f(x) / \partial x_i \partial x_j$.)

Proof: It is well-known that $\nabla_x^2 f(x;\alpha)$ is negative definite at
$(x,\alpha)$ if and only if all of its eigenvalues $\epsilon_k(\nabla_x^2 f(x;\alpha))$
are negative, i.e., if $\max_k \epsilon_k(\nabla_x^2 f(x;\alpha)) < 0$. Assume for the moment
that the last-mentioned function is continuous in $(x,\alpha)$ on some open
region containing $X \times [0,1]$, where $\times$ denotes the Cartesian product.
Since a positive sum of negative definite matrices is again negative
definite, from Condition 3 it follows that $\max_k \epsilon_k(\nabla_x^2 f(x;\alpha)) < 0$
on $X \times [0,1]$. The proposition follows from this fact, the assumed
continuity, and the compactness of $X \times [0,1]$.


To see that $\max_k \epsilon_k(\nabla_x^2 f(x;\alpha))$ is continuous on some open
region containing $X \times [0,1]$, observe that Condition 1 implies that
the elements of $\nabla_x^2 f(x;\alpha)$ are all continuous on some open region
containing $X \times [0,1]$. Since the eigenvalues of a square matrix are
continuous functions of its elements (Ostrowski, 1960, p. 192),
each $\epsilon_k(\nabla_x^2 f(x;\alpha))$ is therefore continuous on some open
region containing $X \times [0,1]$; the same must be true for
$\max_k \epsilon_k(\nabla_x^2 f(x;\alpha))$.
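
As a numerical illustration of Proposition 3 (a sketch under stated assumptions, not part of the original development), the following checks that the largest eigenvalue of $\alpha H_1 + (1-\alpha) H_2$ stays negative on an interval slightly larger than [0,1]. The two constant 2 by 2 Hessians `H1` and `H2`, the grid, and the helper names are hypothetical choices made only for this example.

```python
import math

def max_eig_sym2(a, b, d):
    """Largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius

def max_eig_combo(alpha, h1, h2):
    """Largest eigenvalue of alpha*H1 + (1-alpha)*H2; each H is given as (a, b, d)."""
    a, b, d = (alpha * p + (1.0 - alpha) * q for p, q in zip(h1, h2))
    return max_eig_sym2(a, b, d)

# Hypothetical Hessians of f1 and f2, both negative definite (assumed data).
H1 = (-2.0, 0.5, -1.0)
H2 = (-1.0, -0.3, -3.0)

# Scan alpha over [-0.1, 1.1], an interval slightly larger than [0, 1].
grid = [-0.1 + 0.01 * k for k in range(121)]
worst = max(max_eig_combo(al, H1, H2) for al in grid)
print("largest eigenvalue over the grid:", worst)
```

A grid scan of this kind only suggests, and does not prove, negative definiteness on the whole interval; the proposition itself rests on continuity and compactness.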


Remark: As indicated in Footnote 1 of this chapter, Condition 3 may
be weakened to (in the following, $\epsilon > 0$ is arbitrarily
small): (a) $\nabla_x^2 f_1(x)$ negative definite and $\nabla_x^2 f_2(x)$
negative (semi-)definite for all $x \in X$, if [0,1] is replaced
by $[\epsilon,1]$, or (b) $\nabla_x^2 f_2(x)$ negative definite and
$\nabla_x^2 f_1(x)$ negative (semi-)definite for all $x \in X$, if
[0,1] is replaced by $[0,1-\epsilon]$, or
(c) $\alpha \nabla_x^2 f_1(x) + (1-\alpha) \nabla_x^2 f_2(x)$ is negative definite for
all $x \in X$ at each $\alpha \in (0,1)$, if [0,1] is replaced
by $[\epsilon,1-\epsilon]$.

Condition 4 is equivalent to requiring that the gradients
$\nabla_x g_i(x^*(\alpha_0))$, $i$ such that $g_i(x^*(\alpha_0)) = 0$, be linearly
independent; hence at most n constraints can be satisfied with
exact equality at an optimal solution of $(P\alpha_0)$. In the remark
following the Kuhn-Tucker Theorem, it was noted that this condition
implies that the Kuhn-Tucker Constraint Qualification holds. Thus
the hypotheses of the Kuhn-Tucker Theorem are satisfied by $(P\alpha_0)$
for each fixed $\alpha_0 \in [0,1]$ when Conditions 1, 3, and 4 hold.

2.2 Statement of the Basic Conceptual Algorithm

For convenience we view $\alpha$ as increasing from 0 toward 1.

Step 1: Solve (P0) by any convenient method, so that
$(x^*(0), u^*(0), S^*)$ satisfying (KT-1) through (KT-4)
at $\alpha = 0$ is at hand. Put $\alpha^o = 0$, $S^o = S^*$, and
$(x,u)^o = (x^*(0), u^*(0))$.

Step 2: Solve equations $(=S^o)\alpha$ by any convenient method as
$\alpha$ increases above $\alpha^o$ for the unique continuous
solution $(x^{S^o}(\alpha), u^{S^o}(\alpha))$ satisfying the left
end-point value $(x,u)^o$ so long as this solution satisfies
(KT-3) and (KT-4); that is, until $\alpha = \alpha'$, where

$\alpha' = \operatorname{Max}\{\alpha : \alpha^o \le \alpha \le 1,\ g_i(x^{S^o}(\alpha)) \ge 0\ \forall i \notin S^o,\ u_i^{S^o}(\alpha) \ge 0\ \forall i \in S^o \text{ on } [\alpha^o, \alpha]\}$.

If $\alpha' = 1$, terminate. Otherwise put $(x,u)^o =
(x^{S^o}(\alpha'), u^{S^o}(\alpha'))$ and go to Step 3.

Step 3: Solve equations $(=S)\alpha$ by any convenient method as
$\alpha$ increases above $\alpha'$ for the unique continuous
solution $(x^S(\alpha), u^S(\alpha))$ satisfying the left end-point
value $(x,u)^o$ for different sets S which satisfy

(8.1)  $\{i \in M : u_i^{S^o}(\alpha') > 0\} \subseteq S \subseteq \{i \in M : g_i(x^{S^o}(\alpha')) = 0\}$

until $(x^S(\alpha), u^S(\alpha))$ satisfies (KT-3)
and (KT-4) on $[\alpha', \alpha'+\epsilon]$ for some $\epsilon > 0$. Put
$\alpha^o = \alpha'$, $S^o = S$, and return to Step 2.

(Footnote: Throughout this work we employ the symbol $(x^S(\alpha), u^S(\alpha))$ to
denote a solution of equations $(=S)\alpha$.)

The next subsection is devoted to the development of the theoretical
results necessary for justifying this conceptual algorithm.

Complete justification requires proof of the following

Theorem (Basic):

Assume that Conditions 1 through 4 hold. Then the following
assertions regarding the Basic Conceptual Algorithm hold:

(i) Step 2 is well-defined.

(ii) At each execution of Step 2, $(x^{S^o}(\alpha), u^{S^o}(\alpha)) = (x^*(\alpha),
u^*(\alpha))$ on $[\alpha^o, \alpha']$.

(iii) Step 3 is well-defined.

(iv) Step 3 will be executed only a finite number of times
before termination obtains.

2.3 Theoretical Development

Continuity plays a crucial role in parametric programming.

Theorem 1 (Continuity):

(i) Assume that Conditions 1 through 3 hold. Then $(P\alpha)$ has
a unique optimal solution $x^*(\alpha)$, and $x^*(\alpha)$ is continuous
on some open interval containing [0,1].

(ii) Assume that Conditions 1 through 4 hold. Then $(P\alpha)$ has
unique dual variables $u_i^*(\alpha)$ ($i = 1,\ldots,m$) such that
$(x^*(\alpha), u^*(\alpha))$ satisfies the Kuhn-Tucker Conditions (5)
through (7), and $u^*(\alpha)$ is continuous, on some open interval
containing [0,1].

Proof: First we prove (i). The existence of an optimal solution
of $(P\alpha)$ for any fixed value of $\alpha$ follows from the fact that
$f(x;\alpha)$ is a continuous function of x on the compact set X. The
uniqueness of the optimal solution follows by A.2 from the fact that
$f(x;\alpha)$ is strictly concave in x over the convex set X for each
fixed value of $\alpha$ in some open interval $\mathcal{I}$ containing [0,1]. Denote
the unique optimal solution by $x^*(\alpha)$.

To demonstrate that $x^*(\alpha)$ is continuous on $\mathcal{I}$, suppose
the contrary. Then there exists a sequence $\langle \alpha^\nu \rangle \to \bar\alpha$ with $\alpha^\nu,
\bar\alpha \in \mathcal{I}$ such that $\langle x^*(\alpha^\nu) \rangle \not\to x^*(\bar\alpha)$. Hence there is an (open)
neighborhood $N(x^*(\bar\alpha))$ of $x^*(\bar\alpha)$ such that $x^*(\alpha^\nu) \notin N(x^*(\bar\alpha))$
infinitely often, and by taking a subsequence, if necessary, we may
assume that this holds for all $\nu$. Since $X - N(x^*(\bar\alpha))$ is compact,
we may assume, again taking a subsequence if necessary, that
$\langle x^*(\alpha^\nu) \rangle \to x' \in X - N(x^*(\bar\alpha))$. Thus by the continuity of $f(x;\alpha)$
with respect to $(x,\alpha)$, we obtain

(9)  $\langle f(x^*(\alpha^\nu); \alpha^\nu) \rangle \to f(x'; \bar\alpha)$.

Now $f(x^*(\alpha);\alpha) \equiv \operatorname{Max}\{f(x;\alpha) \text{ subject to } g(x) \ge 0\}$ is the
supremum of a family of functions linear in $\alpha$, and therefore is
convex in $\alpha$ on $\mathcal{I}$. Using A.5, we obtain

(10)  $\langle f(x^*(\alpha^\nu); \alpha^\nu) \rangle \to f(x^*(\bar\alpha); \bar\alpha)$.

Assertions (9) and (10) imply that $f(x';\bar\alpha) = f(x^*(\bar\alpha);\bar\alpha)$; but by
construction $x' \ne x^*(\bar\alpha)$, so that the unique optimality of $x^*(\bar\alpha)$
is violated. Hence $x^*(\alpha)$ must be continuous on $\mathcal{I}$. This completes
the proof of (i).

(Footnote: When used with sets, the symbol $-$ denotes relative complement.
Thus $X - N(x^*(\bar\alpha)) = \{x \in X : x \notin N(x^*(\bar\alpha))\}$.)

Now we prove (ii). The existence of $u^*(\alpha)$ such that $(x^*(\alpha),
u^*(\alpha))$ satisfies (5) through (7) on some open interval containing
[0,1] would follow from the necessity of the Kuhn-Tucker Conditions
if the hypotheses of the Kuhn-Tucker Theorem were satisfied by $(P\alpha)$
on such an interval. It was noted in subsection 2.1 that these
hypotheses are satisfied for each value of $\alpha \in [0,1]$. To show that
this remains true on some open interval containing [0,1], in view
of Condition 1, Proposition 3, and the remark following the statement
of the Kuhn-Tucker Theorem, it is enough to show that Condition 4 is
still satisfied on some open interval containing each end-point.
Consider the left end-point $\alpha = 0$. Denote by $D(\alpha)$ the matrix whose
rows are $\nabla_x g_i(x^*(\alpha))$, $i$ such that $g_i(x^*(0)) = 0$. By Condition 4
applied at $\alpha = 0$, $D(0)$ has rank equal to the number of its rows,
which is equivalent to the existence of $[D(0)D^t(0)]^{-1}$, which is
equivalent to the determinantal inequality $|D(0)D^t(0)| \ne 0$. Since
$|D(\alpha)D^t(\alpha)|$ is a continuous function of $\alpha$ for $\alpha$ sufficiently
near 0, it does not vanish in some open interval containing $\alpha = 0$,
and so $D(\alpha)$ remains of maximal rank on such an interval. This
implies that Condition 4 holds on some open interval containing $\alpha = 0$,
for by the continuity of $x^*(\alpha)$ and of $g_i(x)$, and hence of
$g_i(x^*(\alpha))$, one easily obtains that $\{i : g_i(x^*(\alpha)) = 0\} \subseteq
\{i : g_i(x^*(0)) = 0\}$ for $\alpha$ sufficiently near 0. A similar argument
applies to $\alpha = 1$.

To show the uniqueness and continuity of $u^*(\alpha)$ on some
open interval containing [0,1], fix $\alpha_0 \in [0,1]$. Since $x^*(\alpha)$ is
unique, from (7) we conclude that $u_i^*(\alpha_0)$ must vanish for each $i$
such that $g_i(x^*(\alpha_0)) > 0$. By the continuity of $g_i(x^*(\alpha))$, we
have that $g_i(x^*(\alpha)) > 0$ on some open interval about $\alpha_0$ when
$g_i(x^*(\alpha_0)) > 0$. Hence $u_i^*(\alpha)$ vanishes on some open interval about
$\alpha_0$ for each $i$ such that $g_i(x^*(\alpha_0)) > 0$. Denote $\{i : g_i(x^*(\alpha_0)) = 0\}$
by B. It remains to consider $u_i^*(\alpha)$, $i \in B$. From (5) and (7) one
obtains

(11)  $\nabla_x f(x^*(\alpha_0); \alpha_0) + \sum_{i \in B} u_i^*(\alpha_0) \nabla_x g_i(x^*(\alpha_0)) = 0$.

Since by continuity $\{i : g_i(x^*(\alpha)) = 0\} \subseteq \{i : g_i(x^*(\alpha_0)) = 0\} = B$
for $\alpha$ sufficiently near $\alpha_0$, it follows from (5) and (7) that (11)
must hold in some open interval about $\alpha_0$ with the same summation set.
That is,

(12)  $\nabla_x f(x^*(\alpha); \alpha) + \sum_{i \in B} u_i^*(\alpha) \nabla_x g_i(x^*(\alpha)) = 0$

holds on some open interval about $\alpha_0$. Write $u_B^*(\alpha)$ for the row
vector whose components are $u_i^*(\alpha)$, $i \in B$, and let $D(\alpha)$ now denote
the matrix whose rows are $\nabla_x g_i(x^*(\alpha))$, $i \in B$. Then (12) can be rewritten
in matrix notation as

(12.1)  $u_B^*(\alpha) D(\alpha) = -\nabla_x f(x^*(\alpha); \alpha)$.

Repeating a previous argument, one may assert that $[D(\alpha)D^t(\alpha)]^{-1}$
exists on some open interval containing $\alpha_0$. Postmultiplying (12.1)
by $D^t(\alpha)[D(\alpha)D^t(\alpha)]^{-1}$, one obtains that $u_B^*(\alpha)$ must satisfy

(12.2)  $u_B^*(\alpha) = -\nabla_x f(x^*(\alpha); \alpha) D^t(\alpha) [D(\alpha)D^t(\alpha)]^{-1}$

on some open interval containing $\alpha_0$. The right-hand side is unique
and continuous in $\alpha$, and therefore $u_B^*(\alpha)$ is also unique and
continuous on some open interval containing $\alpha_0$.
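
To make formula (12.2) concrete, here is a minimal sketch for the special case of a single binding constraint, where $D$ has one row $d$ and (12.2) reduces to $u = -(\nabla f \cdot d)/(d \cdot d)$. The thesis intends (12.2) for theoretical rather than computational use, so this is only an illustration; the objective, constraint, optimum, and function name below are hypothetical.

```python
def multiplier_single_constraint(grad_f, grad_g):
    """Special case of (12.2) when D consists of one row d = grad_g:
    u = -(grad_f . d) / (d . d)."""
    dd = sum(c * c for c in grad_g)
    if dd == 0.0:
        raise ValueError("gradient of the binding constraint must be non-zero")
    return -sum(a * b for a, b in zip(grad_f, grad_g)) / dd

# Assumed toy data: f(x) = -(x1-2)^2 - (x2-1)^2, g(x) = 1 - x1 - x2 >= 0.
# The constrained maximizer is x* = (1, 0), where g is binding.
x_star = (1.0, 0.0)
grad_f = (-2.0 * (x_star[0] - 2.0), -2.0 * (x_star[1] - 1.0))
grad_g = (-1.0, -1.0)

u = multiplier_single_constraint(grad_f, grad_g)
# Residual of the stationarity equation grad_f + u * grad_g = 0.
residual = [a + u * b for a, b in zip(grad_f, grad_g)]
print("u =", u, "residual =", residual)
```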

It will prove convenient to introduce some special notations.
Define $A\alpha$ to be the set of constraint indices corresponding to the
constraints which are active at $\alpha$ in the sense that their dual
variables are strictly positive:

$A\alpha = \{i \in M : u_i^*(\alpha) > 0\}$.

Define $B\alpha$ to be the set of constraint indices corresponding to the
constraints which are binding at $x^*(\alpha)$:

$B\alpha = \{i \in M : g_i(x^*(\alpha)) = 0\}$.

The sets $A\alpha$ and $B\alpha$ are well-defined on some open interval
containing [0,1] because of the existence and uniqueness of $(x^*(\alpha),
u^*(\alpha))$ on some such interval. We can now state two important
corollaries of Theorem 1.


Corollary 1.1:

Assume that Conditions 1 through 4 hold. Then for each $\alpha_0 \in [0,1]$
there exists an open interval containing $\alpha_0$ such that, on
this interval,

$A\alpha_0 \subseteq A\alpha \subseteq B\alpha \subseteq B\alpha_0$.

(Footnote: Equation (12.2) is intended only for theoretical and not
computational use.)

Proof: The outermost relations follow directly from the definitions
of $A\alpha$ and $B\alpha$ and the continuity of $x^*(\alpha)$ and $u^*(\alpha)$. The middle
relation follows from (7).

Corollary 1.2:

Assume that Conditions 1 through 4 hold. Then there is an open
interval containing [0,1] such that, for each fixed value of
$\alpha$ in this open interval, a subset S of constraint indices is
valid at $\alpha$ if and only if $A\alpha \subseteq S \subseteq B\alpha$.

Proof: This assertion is an immediate consequence of the uniqueness
of $(x^*(\alpha), u^*(\alpha))$, and Proposition 2.
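
The characterization in Corollary 1.2 can be sketched as a small enumeration: given the active set and the binding set at some $\alpha$, the valid sets are exactly the index sets sandwiched between them. The function name and the example index sets are assumptions made for illustration.

```python
from itertools import combinations

def valid_sets(active, binding):
    """All index sets S with active <= S <= binding, which is the
    characterization of validity given by Corollary 1.2."""
    active, binding = frozenset(active), frozenset(binding)
    if not active <= binding:
        raise ValueError("the active set must be contained in the binding set")
    free = sorted(binding - active)
    result = []
    for k in range(len(free) + 1):
        for extra in combinations(free, k):
            result.append(active | frozenset(extra))
    return result

# Hypothetical example: constraint 2 is active; constraints 1, 2, 4 are binding.
sets = valid_sets(active={2}, binding={1, 2, 4})
for s in sets:
    print(sorted(s))
```

With $|B\alpha - A\alpha| = k$ there are $2^k$ valid sets, so the enumeration is only practical when few binding constraints have zero multipliers.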

The significance of Corollaries 1.1 and 1.2 is that the totality
of valid sets at $\alpha_0 \in [0,1]$ contains the totality of valid sets
for $\alpha$ sufficiently near $\alpha_0$. Hence the optimal solution of $(P\alpha_0)$,
which yields $A\alpha_0$ and $B\alpha_0$, gives a strong indication of the identity
of a valid set for $\alpha$ near $\alpha_0$.

The next theorem shows that equations $(=S)\alpha$ can be solved on
some open interval about $\alpha_0 \in [0,1]$ if S is valid at $\alpha_0$.

Theorem 2:

Let $\alpha_0 \in [0,1]$ be fixed, let S be valid at $\alpha_0$, and assume
Conditions 1 through 4 hold.
Then there exist an open interval $I\alpha_0$ containing and symmetric
about $\alpha_0$, and an open neighborhood $N(x^*(\alpha_0), u^*(\alpha_0))$ containing
$(x^*(\alpha_0), u^*(\alpha_0))$, such that on $I\alpha_0$ there is a unique function
$(x^S(\alpha), u^S(\alpha))$ in $N(x^*(\alpha_0), u^*(\alpha_0))$ which satisfies $(=S)\alpha$.
Furthermore, $(x^S(\alpha), u^S(\alpha))$ is analytic on $I\alpha_0$.

Proof: The theorem would follow directly from a version of the
Implicit Function Theorem (Bochner and Martin, 1948, p. 39) applied
to the equations $(=S)\alpha$ if the following hypotheses of that theorem
were satisfied:

(a) $(x^*(\alpha_0), u^*(\alpha_0))$ satisfies $(=S)\alpha_0$.

(b) The left-hand side of each equation of $(=S)\alpha$ is analytic
in $(x,u,\alpha)$ in an open neighborhood of $(x^*(\alpha_0), u^*(\alpha_0), \alpha_0)$.

(c) The Jacobian $\partial((=S)\alpha_0) / \partial(x,u)$ is non-zero at $(x^*(\alpha_0), u^*(\alpha_0))$.


By the validity of S at $\alpha_0$, part (i) of Proposition 1, and Corollary
1.2, (a) holds. It follows from Condition 1 that (b) holds. To
simplify the task of showing that (c) holds, we regroup the order of
partial differentiation, which is equivalent to regrouping the columns
of the Jacobian matrix, so that we actually consider the Jacobian

$$\frac{\partial((=S)\alpha_0)}{\partial(x;\ u_i,\ i \in S;\ u_i,\ i \notin S)} .$$

Writing H for the n by n Hessian matrix
$\nabla_x^2 f(x;\alpha_0) + \sum_{i \in S} u_i \nabla_x^2 g_i(x)$
and D for the matrix whose rows are $\nabla_x g_i(x)$, $i \in S$, one
readily derives that this Jacobian, evaluated at $(x^*(\alpha_0), u^*(\alpha_0))$,
is the determinant of the partitioned matrix

$$\begin{pmatrix} H & D^t & 0 \\ D & 0 & 0 \\ 0 & 0 & I \end{pmatrix}$$

where 0 and I are zero and identity matrices of the appropriate
orders. The determinant is non-zero if and only if

$$\begin{pmatrix} H & D^t \\ D & 0 \end{pmatrix}$$

is invertible, which is true if and only if the matrix equation

(13)  $$\begin{pmatrix} H & D^t \\ D & 0 \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} = 0$$

has $y = 0$, $z = 0$ as its only solution, where y is an n-vector
and z is a vector with a number of components equal to the number
of constraint indices in S. The proof of the theorem will be complete
when we show that (13) has only the null solution.

Performing the indicated block multiplications for (13), one
obtains

(13.1)  $Hy + D^t z = 0$  and

(13.2)  $Dy = 0$.

Now H is negative definite, for it is a positive linear combination
of negative semidefinite Hessians, at least one of which is known to
be negative definite. Hence H is invertible, and (13.1) yields

(13.3)  $y = -H^{-1} D^t z$.

Premultiplying (13.3) by D and using (13.2), one obtains

(13.4)  $Dy = -D H^{-1} D^t z = 0$.

By Corollary 1.2, $S \subseteq B\alpha_0$. By Condition 4, therefore, D
is of maximal rank, and that rank equals the number of rows of D.
Hence $D H^{-1} D^t$ is invertible, and (13.4) yields $z = 0$. By (13.3),
$y = 0$ also. Thus (13) has only the null solution.
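
The nonsingularity argument can be checked numerically on a small instance. The sketch below assembles the block matrix of (13) from an assumed negative definite H and a constraint-gradient matrix D, and compares a full-row-rank D with a rank-deficient one. The matrices and helper names are hypothetical, and the determinant routine is a plain Gaussian elimination written for self-containment.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n, sign, value = len(a), 1.0, 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        value *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return sign * value

def kkt_matrix(h, d):
    """Assemble [[H, D^t], [D, 0]] from H (n by n) and D (s by n)."""
    n, s = len(h), len(d)
    top = [list(h[i]) + [d[j][i] for j in range(s)] for i in range(n)]
    bottom = [list(d[j]) + [0.0] * s for j in range(s)]
    return top + bottom

H = [[-2.0, 0.0], [0.0, -1.0]]            # assumed negative definite Hessian
D_full = [[1.0, 1.0]]                      # one constraint gradient, full row rank
D_deficient = [[1.0, 1.0], [2.0, 2.0]]     # dependent rows, violating Condition 4

print(det(kkt_matrix(H, D_full)))          # non-zero: (13) has only the null solution
print(det(kkt_matrix(H, D_deficient)))     # zero: the block matrix is singular
```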

Corollary 2.1:

Let $\alpha_0 \in [0,1]$ be fixed, let S be valid at $\alpha_0$, and assume
that Conditions 1 through 4 hold.
Then there exists an open interval containing $\alpha_0$ and contained
in $I\alpha_0$ such that, for each fixed value of $\alpha$ in this interval,
the following three assertions are equivalent:

(i) S is valid at $\alpha$.

(ii) $(x^S(\alpha), u^S(\alpha)) = (x^*(\alpha), u^*(\alpha))$.

(iii) $g_i(x^S(\alpha)) \ge 0$, $\forall i \notin S$, and
$u_i^S(\alpha) \ge 0$, $\forall i \in S$.

Proof: (i) $\Rightarrow$ (ii). By continuity, $(x^*(\alpha), u^*(\alpha)) \in N(x^*(\alpha_0),
u^*(\alpha_0))$ for all $\alpha$ sufficiently near $\alpha_0$; by the validity of S
at $\alpha$, part (i) of Proposition 1, and Corollary 1.2, one concludes
that $(x^*(\alpha), u^*(\alpha))$ satisfies $(=S)\alpha$; since the solution of
$(=S)\alpha$ is unique in $N(x^*(\alpha_0), u^*(\alpha_0))$ for $\alpha \in I\alpha_0$, assertion
(ii) follows.

(ii) $\Rightarrow$ (iii). Because $(x^*(\alpha), u^*(\alpha))$ satisfies (5) through
(7), (iii) must hold.

(iii) $\Rightarrow$ (i). Assertion (iii) and the fact that $(x^S(\alpha),
u^S(\alpha))$ satisfies $(=S)\alpha$ imply by the definition of validity that
S is valid at $\alpha$.

One more result must be established before a complete proof of
the Basic Theorem can be given.

Define a point of change of $B\alpha$ as a point $\alpha'$ with the
property that there is no open interval containing $\alpha'$ such that
$B\alpha = B\alpha'$ everywhere on that interval. A similar definition holds
for a point of change of $A\alpha$. In the sequel, the phrase "point of
change" is used to refer to either a point of change of $A\alpha$ or of
$B\alpha$, or possibly of both.

Theorem 3 (Finiteness):

Assume that Conditions 1 through 4 hold. Then $A\alpha$ and $B\alpha$
each have a finite number of points of change on [0,1].

Proof: Suppose that $B\alpha$ has an infinite number of points of
change on [0,1]. Then there is a cluster point $\bar\alpha \in [0,1]$ of
these points of change. Let $\langle \alpha^\nu \rangle$, $\alpha^\nu \in [0,1]$, be a sequence of
distinct points of change of $B\alpha$ which converges to $\bar\alpha$. Applying
Corollary 1.1 at $\alpha^\nu$, we see that there exists an open interval
containing $\alpha^\nu$ such that $A\alpha^\nu \subseteq A\alpha \subseteq B\alpha \subseteq B\alpha^\nu$ on this interval.
By the definition of a point of change of $B\alpha$, for each $\alpha^\nu$ there
exists a number $\beta^\nu$ contained in this interval and in $(\alpha^\nu - 1/\nu,
\alpha^\nu + 1/\nu)$ such that $A\alpha^\nu \subseteq A\beta^\nu \subseteq B\beta^\nu \subset B\alpha^\nu$ (note that $B\beta^\nu$ is a
proper subset of $B\alpha^\nu$). Clearly $\langle \beta^\nu \rangle \to \bar\alpha$. From Corollary 1.1
applied at $\bar\alpha$, we see that we have demonstrated the existence of
two sequences $\langle \alpha^\nu \rangle \to \bar\alpha$, $\langle \beta^\nu \rangle \to \bar\alpha$, such that
$A\bar\alpha \subseteq A\alpha^\nu \subseteq A\beta^\nu \subseteq B\beta^\nu \subset B\alpha^\nu \subseteq B\bar\alpha$ for all $\nu$ sufficiently large.
Since there is but a finite number ($2^m$) of possible sets which $B\beta^\nu$ or $B\alpha^\nu$
could possibly be, we may assume, taking a subsequence if necessary,
that there exist sets $B'$ and $B''$ such that $B\beta^\nu = B'' \subset B\alpha^\nu = B'$
for all $\nu$.

Consider the function $x^{B''}(\alpha)$ defined as in Theorem 2 applied
at $\bar\alpha$. Since $B''$ is valid at $\bar\alpha$ and at all $\alpha^\nu$ and $\beta^\nu$, $\nu$
sufficiently large, $x^{B''}(\alpha) = x^*(\alpha)$ at these points. Take $i_0 \in B' - B''$.
Then $g_{i_0}(x^{B''}(\alpha^\nu)) = 0$ and $g_{i_0}(x^{B''}(\beta^\nu)) > 0$, all $\nu$ sufficiently
large, and $g_{i_0}(x^{B''}(\bar\alpha)) = 0$. In other words, we have shown that
$\bar\alpha$ is a non-isolated zero of $g_{i_0}(x^{B''}(\alpha))$, and that this function is
not identically zero on any open interval about $\bar\alpha$. But this leads
to a contradiction of the well-known fact (Apostol, 1957, p. 518) that
the zeros of an analytic function which is not identically zero are
isolated, for by Theorem 2 and Condition 1 we have that $g_{i_0}(x^{B''}(\alpha))$
is analytic on some open interval about $\bar\alpha$. Hence the supposition
that $B\alpha$ has an infinite number of points of change on [0,1] is
false.

A similar argument shows that $A\alpha$ cannot have an infinite number
of points of change on [0,1].


Applying the result of Theorem 3 to a given $(P\alpha)$, define
$0 \le \alpha'_1 < \cdots < \alpha'_N \le 1$ to be the collection of all points of
change of $A\alpha$ or $B\alpha$ or both. As a matter of convention we take
$\alpha'_0 = 0$ and $\alpha'_{N+1} = 1$. From Corollaries 1.1 and 1.2 we conclude that
any set which is valid at $\alpha$, $\alpha'_j < \alpha < \alpha'_{j+1}$, is also valid on the
entire closed interval $[\alpha'_j, \alpha'_{j+1}]$. In addition, it may also be
valid on other intervals, of course. Among the sets which are valid
at $\alpha'_j$ there are all those which are valid on $[\alpha'_{j-1}, \alpha'_j]$ or on
$[\alpha'_j, \alpha'_{j+1}]$.

We are now in a position to prove the Basic Theorem.


Proof (Basic Theorem): First we prove parts (i) and (ii). At
the beginning of each Step 2, $(x,u)^o$ and $S^o$ satisfy (KT-1)
through (KT-4) at $\alpha^o$, so that $S^o$ is valid at $\alpha^o$ and $(x,u)^o =
(x^*(\alpha^o), u^*(\alpha^o))$. Let J, $0 \le J \le N+1$, be the largest integer such
that $S^o$ is valid on $[\alpha^o, \alpha'_J]$ ($\alpha'_J = \alpha^o = 0$ is permissible
the first time Step 2 is executed). Applying Theorem 2 at each point
of $[\alpha^o, \alpha'_J]$, it follows that $(=S^o)\alpha$ has a unique analytic solution
$(x^{S^o}(\alpha), u^{S^o}(\alpha))$ satisfying the left end-point value $(x,u)^o$ on some
interval containing $[\alpha^o, \alpha'_J]$. This solution satisfies (KT-3) and
(KT-4) and equals $(x^*(\alpha), u^*(\alpha))$ on $[\alpha^o, \alpha'_J]$ by Corollary 2.1.

If $\alpha'_J = 1$, the solution of $(P\alpha)$ on [0,1] is complete. If
$\alpha'_J < 1$, however, $(x^{S^o}(\alpha), u^{S^o}(\alpha))$ does not satisfy (KT-3) and
(KT-4) for any $\alpha \in (\alpha'_J, \alpha'_{J+1})$, for otherwise by Corollary 2.1
applied at $\alpha'_J$, $S^o$ would be valid on $[\alpha'_J, \alpha'_{J+1}]$, which would
violate the definition of J. Clearly the scalar $\alpha'$ defined in
Step 2 is precisely $\alpha'_J$, and (i) and (ii) hold.

Next we prove (iii). Any set S which satisfies (8.1) is valid
at $\alpha'$, by Corollary 1.2 and the fact that $(x^{S^o}(\alpha'), u^{S^o}(\alpha')) =
(x^*(\alpha'), u^*(\alpha'))$. Applying Theorem 2 at $\alpha'$, we see that if S
satisfies (8.1) then $(=S)\alpha$ has a solution as stated on $[\alpha', \alpha'+\epsilon_1]$
for some $\epsilon_1 > 0$. By Corollary 1.1 we know that at least one such
S, say $S'$, is valid on $[\alpha', \alpha'+\epsilon_2]$ for some $0 < \epsilon_2 \le \epsilon_1$; by
Corollary 2.1 applied at $\alpha'$, $(x^{S'}(\alpha), u^{S'}(\alpha))$ satisfies (KT-3)
and (KT-4) on $[\alpha', \alpha'+\epsilon]$ for some $0 < \epsilon \le \epsilon_2$. Since there is but
a finite number of sets satisfying (8.1), $S'$ will be found after a
finite number of trials.

Finally, we prove (iv). It was established in the proof of (i)
that Step 3 is entered each time a point of change $\alpha'$ is encountered
at Step 2 such that the current set $S^o$ being used at Step 2 is not
valid immediately above $\alpha'$. It was established in the proof of
(iii) that Step 3 finds a set which is valid immediately above $\alpha'$
in a finite number of trials, and control is returned to Step 2 along
with the new valid set. By convention we have taken $\alpha$ increasing,
and by Theorem 3 there is but a finite number of points of change on
[0,1]; it follows that Step 3 will only have to be executed a finite
number of times before termination obtains.

At Step 2, $\alpha'$ need not be the first point of change above
$\alpha^o$, for $S^o$ may remain valid on an interval spanning several
points of change. The algorithm could be modified to require
$S^o = B\alpha$ at Step 2, so that $\alpha'$ would assume, in turn, the
values of each point of change of $B\alpha$; or one could require
$S^o = A\alpha$ at Step 2, so that $\alpha'$ would assume, in turn, the
values of each point of change of $A\alpha$. The minimum requirement
(the one adopted here) is $A\alpha \subseteq S^o \subseteq B\alpha$ at Step 2, and
seems more symmetrical and less arbitrary than either of the
extreme requirements just mentioned.

From the proof of the Basic Theorem, it is clear that the Basic
Conceptual Algorithm can be paraphrased as follows.

Step 1: By any convenient method, find the optimal solution
$(x^*(0), u^*(0))$ of (P0). Set $\alpha^o = 0$, $S^o$ equal
to any set valid at $\alpha = 0$, and $(x,u)^o = (x^*(0), u^*(0))$.

Step 2: Solve $(=S^o)\alpha$ as $\alpha$ increases above $\alpha^o$ for its
unique continuous solution satisfying the left end-point
condition $(x^{S^o}(\alpha^o), u^{S^o}(\alpha^o)) = (x,u)^o$, namely
$(x^*(\alpha), u^*(\alpha))$, until either $\alpha = 1$ or a point of
change $\alpha'$ of $A\alpha$ or $B\alpha$ is encountered to the right
of which $S^o$ is no longer valid. In the first case,
terminate; in the second case, set $(x,u)^o = (x^*(\alpha'),
u^*(\alpha'))$ and go to Step 3.

Step 3: Among all sets valid at $\alpha'$, find one which is valid to
the right of $\alpha'$. Call it $S'$. Set $\alpha^o = \alpha'$, $S^o = S'$,
and return to Step 2.

See Appendix B for graphical illustrations of this algorithm.
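
The paraphrased steps can be sketched on a one-variable toy problem. Everything in this example is an assumption made for illustration: $f_1(x) = -(x-2)^2$, $f_2(x) = -(x+1)^2$, constraints $g_1(x) = x \ge 0$ and $g_2(x) = 1 - x \ge 0$, and a fixed grid in $\alpha$. For this problem $(=S)\alpha$ can be solved in closed form, and the points of change are $\alpha = 1/3$ and $\alpha = 2/3$. Unlike Step 3 as stated, which restricts candidates to sets valid at the exact point of change, the sketch simply tests each candidate set at the next grid point.

```python
def solve_equations(S, alpha):
    """Solve (=S)alpha for the toy problem; returns (x, u1, u2)."""
    def slope(x):
        # d/dx of alpha*f1(x) + (1-alpha)*f2(x)
        return -2.0 * x + 6.0 * alpha - 2.0
    if S == frozenset():
        return 3.0 * alpha - 1.0, 0.0, 0.0     # slope(x) = 0
    if S == frozenset({1}):                     # g1(x) = x = 0
        return 0.0, -slope(0.0), 0.0
    if S == frozenset({2}):                     # g2(x) = 1 - x = 0
        return 1.0, 0.0, slope(1.0)
    raise ValueError("unsupported index set")

def is_valid(S, alpha, tol=1e-9):
    """Check (KT-3) for i not in S and (KT-4) for i in S."""
    x, u1, u2 = solve_equations(S, alpha)
    g = {1: x, 2: 1.0 - x}
    u = {1: u1, 2: u2}
    feasible = all(g[i] >= -tol for i in (1, 2) if i not in S)
    nonneg = all(u[i] >= -tol for i in S)
    return feasible and nonneg

CANDIDATES = [frozenset(), frozenset({1}), frozenset({2})]

def trace(steps=20):
    """Follow the valid set along a grid of alpha values in [0, 1]."""
    S = next(c for c in CANDIDATES if is_valid(c, 0.0))             # Step 1
    path = []
    for k in range(steps + 1):
        alpha = k / steps
        if not is_valid(S, alpha):                                   # Step 2 ends
            S = next(c for c in CANDIDATES if is_valid(c, alpha))    # Step 3
        path.append((alpha, S, solve_equations(S, alpha)[0]))
    return path

path = trace(20)
for alpha, S, x in path:
    print(f"alpha={alpha:.2f}  S={sorted(S)}  x*={x:.3f}")
```

On this grid the traced set changes from {1} to the empty set and then to {2}, at the first grid points past 1/3 and 2/3 respectively.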

Now that the Basic Conceptual Algorithm has been theoretically
justified, we take up computational considerations.

3. A Basic Computational Algorithm

In order to implement the Basic Conceptual Algorithm, it is necessary
to have a method of actually solving $(=S)\alpha$ as $\alpha$ changes parametrically.
Only in certain simple cases is it possible or economical to
solve these equations analytically, and so usually numerical methods
must be used. We recommend Newton's method, or a variation thereof,
as an efficient means of solving $(=S)\alpha$ on a digital computer as $\alpha$
changes by small discrete jumps.

After proving the applicability of Newton's method, we state and
prove a Basic Computational Algorithm. Some necessary computational
refinements are then briefly indicated, with further details being
added in Appendix C.

3.1 Newton's Method

Newton's method is briefly reviewed in Appendix C. Under Conditions
1 through 4, it is easily seen from Theorem C.1 of Appendix C and the
proof of Theorem 2 that for each $\alpha_0 \in [0,1]$, Newton's method applied
to $(=S)\alpha_0$ is well-defined and quadratically convergent to $(x^*(\alpha_0),
u^*(\alpha_0))$ if S is valid at $\alpha_0$ and if the starting point $(x,u)^o$ is
in a sufficiently small neighborhood of $(x^*(\alpha_0), u^*(\alpha_0))$. Since
$(x^*(\alpha), u^*(\alpha))$ is continuous, by taking $\Delta\alpha$ small enough
$(x^*(\alpha_0 - \Delta\alpha), u^*(\alpha_0 - \Delta\alpha))$ is such a starting point. In other
words, Newton's method is applicable point by point. Does there exist
$\Delta\alpha > 0$ such that a computational algorithm can be designed using
Newton's method to solve $(=S)\alpha$ with $\Delta\alpha$ as a fixed step size
throughout? The answer is affirmative, and requires a proof that the
size of the neighborhoods mentioned above may be taken to be bounded
away from zero.
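
The point-by-point use of Newton's method with a fixed step $\Delta\alpha$ can be sketched on a scalar example. The assumptions here are illustrative only: $f_1(x) = -e^{x}$ and $f_2(x) = -e^{-x}$ (both strictly concave and analytic), no binding constraints (S empty), so that $(=S)\alpha$ reduces to the single equation $F(x;\alpha) = -\alpha e^{x} + (1-\alpha) e^{-x} = 0$. Each solve is warm-started from the solution at the previous value of $\alpha$.

```python
import math

def newton_continuation(alpha_start, alpha_end, d_alpha, tol=1e-12, max_iter=50):
    """Track the root of F(x; alpha) = -alpha*exp(x) + (1-alpha)*exp(-x)
    as alpha moves from alpha_start to alpha_end in steps of d_alpha."""
    alpha = alpha_start
    x = 0.5 * math.log((1.0 - alpha) / alpha)    # exact root at the start
    steps = int(round((alpha_end - alpha_start) / d_alpha))
    history = [(alpha, x, 0)]
    for k in range(1, steps + 1):
        alpha = alpha_start + k * d_alpha
        iterations = 0
        for iterations in range(1, max_iter + 1):
            F = -alpha * math.exp(x) + (1.0 - alpha) * math.exp(-x)
            dF = -alpha * math.exp(x) - (1.0 - alpha) * math.exp(-x)
            step = F / dF
            x -= step                             # Newton update
            if abs(step) < tol:
                break
        history.append((alpha, x, iterations))
    return history

history = newton_continuation(0.3, 0.7, 0.05)
for alpha, x, iterations in history:
    print(f"alpha={alpha:.2f}  x={x:+.10f}  newton_iterations={iterations}")
```

In this example the derivative of F never vanishes, so the Newton step is always defined; in general that property is what the Jacobian condition of Theorem 2 supplies.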

Theorem 4.1:

Let Conditions 1 through 4 hold, let $\alpha_0 \in [0,1]$, not a point of
change, be fixed, and let S be valid at $\alpha_0$.
Then there exists a scalar $r' > 0$, which does not depend on
$\alpha_0$, such that Newton's method applied to equations $(=S)\alpha_0$
is well-defined and quadratically convergent to $(x^*(\alpha_0), u^*(\alpha_0))$
if the starting point $(x,u)^o$ is in the (n+m dimensional) neighborhood
of radius $r'$ about $(x^*(\alpha_0), u^*(\alpha_0))$.

Proof:

1. We shall use the notation and observations immediately following the proof of Theorem 3. To prove this theorem it is sufficient to show that for each $j$ ($j = 0, \ldots, N$) there exists a scalar $r(j) > 0$ such that the following assertions hold on $N_{r(j)}(x^*(\alpha_o), u^*(\alpha_o))$ for any fixed $\alpha_o \in [\alpha'_j, \alpha'_{j+1}]$ and any $S$ valid on $[\alpha'_j, \alpha'_{j+1}]$:

(a) The left-hand side of each equation of $(=S)_{\alpha_o}$ is twice continuously differentiable with respect to $(x,u)$.

(b) The Jacobian $\dfrac{\partial((=S)_{\alpha_o})}{\partial(x,u)} \neq 0$.

(c) $A(x,u;\alpha_o,S) < L < 1$, where $A(x,u;\alpha_o,S)$ is a certain upper estimate of the norm of the Jacobian matrix of the iteration function derived by applying Newton's method to $(=S)_{\alpha_o}$ (see section 1 of Appendix C).

To see why this plan is sufficient, let $r' = \min\{r(0), \ldots, r(N)\}$, let $\alpha_o \in [0,1]$, not a point of change, be fixed, and let $S$ be valid at $\alpha_o$. Then for some $j$ between $0$ and $N$ we have that $S$ is valid on $[\alpha'_j, \alpha'_{j+1}]$ and $\alpha_o \in [\alpha'_j, \alpha'_{j+1}]$. Applying Theorem C.2 of Appendix C, we see that Newton's method applied to $(=S)_{\alpha_o}$ is well-defined and quadratically convergent to $(x^*(\alpha_o), u^*(\alpha_o))$ if the starting point $(x,u)^o \in N_{r'}(x^*(\alpha_o), u^*(\alpha_o))$.

2. Let $j$ be fixed, $0 \le j \le N$, let $\alpha_o \in [\alpha'_j, \alpha'_{j+1}]$, and let $S$ be any set which is valid on $[\alpha'_j, \alpha'_{j+1}]$.

By Condition 1, the left-hand side of each equation of $(=S)_{\alpha_o}$ is twice continuously differentiable with respect to $(x,u)$ on some open neighborhood of $(x^*(\alpha_o), u^*(\alpha_o))$.

The Jacobian $\dfrac{\partial((=S)_{\alpha_o})}{\partial(x,u)} \neq 0$ at $(x^*(\alpha_o), u^*(\alpha_o))$ by the proof of Theorem 2. As a consequence of Condition 1, this Jacobian is continuous with respect to $(x,u)$ on some open neighborhood of $(x^*(\alpha_o), u^*(\alpha_o))$. One concludes that the Jacobian does not vanish in some open neighborhood of $(x^*(\alpha_o), u^*(\alpha_o))$.

It can be shown in a straightforward manner (see Henrici, 1964, p. 106) that $A(x,u;\alpha_o,S)$ vanishes at $(x^*(\alpha_o), u^*(\alpha_o))$. By Condition 1 this function is continuous with respect to $(x,u)$ on some open neighborhood of $(x^*(\alpha_o), u^*(\alpha_o))$. One concludes that $A(x,u;\alpha_o,S) < L$, where $0 < L < 1$, on some open neighborhood about $(x^*(\alpha_o), u^*(\alpha_o))$.

Summarizing this part of the proof, we assert that (a), (b), and (c) hold on some open neighborhood of $(x^*(\alpha_o), u^*(\alpha_o))$ when $\alpha_o \in [\alpha'_j, \alpha'_{j+1}]$ and $S$ is any set which is valid on $[\alpha'_j, \alpha'_{j+1}]$.

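3. Since $(x^*(\alpha), u^*(\alpha))$ is continuous on the compact set $[\alpha'_j, \alpha'_{j+1}]$, the image set

$\Gamma = \{(x,u) : (x,u) = (x^*(\alpha), u^*(\alpha)) \text{ for some } \alpha,\ \alpha'_j \le \alpha \le \alpha'_{j+1}\}$

is compact. It follows from the compactness of $\Gamma$ and the result of part 2 of this proof that there exists a scalar $r(j) > 0$ such that (a), (b), and (c) hold on $N_{r(j)}(x^*(\alpha_o), u^*(\alpha_o))$ when $\alpha_o \in [\alpha'_j, \alpha'_{j+1}]$ and $S$ is any set which is valid on $[\alpha'_j, \alpha'_{j+1}]$.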

When Conditions 1 through 4 hold, two lengths are of interest: the minimum distance between any two points of change on $[0,1]$, and the length of the shortest of all the intervals $I_{\alpha'}$ defined in Theorem 2 applied at every point of change on $[0,1]$ with each set which is valid at each point of change. The scalar $\bar{\ell}$ is defined in terms of these two lengths. Note that $(x^S(\alpha), u^S(\alpha))$ is uniquely defined, by Theorem 2 applied at $\alpha'_j$, for any $1 \le j \le N$ and any $S$ valid at $\alpha'_j$.

Theorem 4.2:

Let Conditions 1 through 4 hold, let $\alpha' \in [0,1]$ be a particular point of change, and let $S$ be valid at $\alpha'$.

Then there exist scalars $r'' > 0$ and $0 < \ell'' < \bar{\ell}$, which do not depend on $\alpha'$ or on $S$, such that Newton's method applied to $(=S)_{\alpha_o}$ is well-defined and quadratically convergent to $(x^S(\alpha_o), u^S(\alpha_o))$ if $\alpha_o \in [\alpha' - \ell'', \alpha' + \ell'']$ and if the starting point $(x,u)^o \in N_{r''}(x^S(\alpha_o), u^S(\alpha_o))$.

Proof:

1. Since there is a finite number of points of change on $[0,1]$ and a finite number of valid sets at each, it is sufficient to show that the theorem holds with $r$ and $\ell$ possibly depending on $\alpha'$ and $S$. This will be done by applying Theorem C.2 of Appendix C.

2. Let $\alpha' \in [0,1]$ be a particular point of change, and let $S$ be valid at $\alpha'$. It remains to demonstrate the existence of scalars $r > 0$ and $0 < \ell < \bar{\ell}$ such that the following three assertions hold on $N_r(x^S(\alpha_o), u^S(\alpha_o))$ when $\alpha_o \in [\alpha' - \ell, \alpha' + \ell]$:

(a) The left-hand side of each equation of $(=S)_{\alpha_o}$ is twice differentiable with respect to $(x,u)$.

(b) The Jacobian $\dfrac{\partial((=S)_{\alpha_o})}{\partial(x,u)} \neq 0$.

(c) $A(x,u;\alpha_o,S) < L < 1$.

3. In view of the fact that $(x^S(\alpha'), u^S(\alpha')) = (x^*(\alpha'), u^*(\alpha'))$, we may argue as in part 2 of the proof of Theorem 4.1 that (a), (b), and (c) hold for $\alpha_o = \alpha'$ on some open neighborhood of $(x^S(\alpha'), u^S(\alpha'))$.

4. Since $(x^S(\alpha), u^S(\alpha))$ is continuous on the closed interval $I_{\alpha'}$, and therefore uniformly continuous, one may assert the existence of scalars $r > 0$ and $0 < \ell < \bar{\ell}$ such that (a), (b), (c) hold on $N_r(x^S(\alpha_o), u^S(\alpha_o))$ when $\alpha_o \in [\alpha' - \ell, \alpha' + \ell]$.

By specializing Theorem 4.2 to $\alpha_o = \alpha'$, and recalling that $(x^S(\alpha'), u^S(\alpha')) = (x^*(\alpha'), u^*(\alpha'))$ when $S$ is valid at $\alpha'$, it is evident that Theorem 4.1 is still true if $\alpha_o$ is permitted to be a point of change. Since $(x^*(\alpha), u^*(\alpha))$ is continuous on $[0,1]$, it is uniformly continuous on $[0,1]$, and one immediately obtains the following corollary of Theorem 4.1.

Corollary 4.1:

Let Conditions 1 through 4 hold, let $\alpha_o \in [0,1]$, and let $S$ be valid at $\alpha_o$.

Then there exists a scalar $\delta' > 0$, which does not depend on $\alpha_o$ or on $S$, such that Newton's method applied to $(=S)_{\alpha_o}$ is well-defined and quadratically convergent to $(x^*(\alpha_o), u^*(\alpha_o))$ if the starting point is $(x^*(\alpha_o - \delta), u^*(\alpha_o - \delta))$ and $|\delta| < \delta'$, $0 \le \alpha_o - \delta \le 1$.

A similar argument shows that Theorem 4.2 yields the following corollary.

Corollary 4.2:

Let Conditions 1 through 4 hold, let $\alpha' \in [0,1]$ be a particular point of change, and let $S$ be valid at $\alpha'$.

Then there exist scalars $\delta'' > 0$ and $0 < \ell'' < \bar{\ell}$, which do not depend on $\alpha'$ or on $S$, such that Newton's method applied to $(=S)_{\alpha_o}$ is well-defined and quadratically convergent to $(x^S(\alpha_o), u^S(\alpha_o))$ if $\alpha_o \in [\alpha' - \ell'', \alpha' + \ell'']$ and if $(x^*(\alpha_o - \delta), u^*(\alpha_o - \delta))$ is the starting point and $|\delta| < \delta''$, $0 \le \alpha_o - \delta \le 1$.

3.2 The Basic Computational Algorithm

Using the results of the previous subsection, we can design a computational counterpart of the Basic Conceptual Algorithm by using Newton's method to solve $(=S)_\alpha$ as $\alpha$ increases by steps of size $\Delta\alpha$. A useful idealization is obtained by assuming that there is no computational error. In view of the quadratic nature of the convergence of Newton's method, it is no less plausible to assume that Newton's method converges to an exact solution of $(=S)_\alpha$ when it theoretically should converge. 1/ An annotated flow chart of the Basic Computational Algorithm is given in Figure 1.

1/ This assumption is strictly true only when the objective functions are quadratic polynomials and all constraints are linear, in which case $(=S)_\alpha$ is a set of linear equations in $(x,u)$ and Newton's method therefore leads to the solution in a single iteration.

Theorem 5:

Assume that Conditions 1 through 4 hold, that there is no computational error, and that Newton's method converges to an exact solution of $(=S)_\alpha$ when it theoretically should converge.

Then there exist $\epsilon > 0$ and $\Delta\alpha > 0$ such that the Basic Computational Algorithm is well-defined and will terminate with $J\Delta\alpha = 1$ in a finite number of computational steps.

Proof: Put

$\epsilon_1 = \tfrac{1}{2} \min \{u_i^*(\alpha'_j) : i \in A\alpha'_j,\ j = 1, \ldots, N\}$.

By construction, $\epsilon_1 > 0$. By the uniform continuity of $u_i^*(\alpha)$ ($i = 1, \ldots, m$) on $[0,1]$, there exists a scalar $\delta_1 > 0$ such that $|\alpha - \alpha'_j| < \delta_1$ implies $|u_i^*(\alpha) - u_i^*(\alpha'_j)| < \epsilon_1$ ($i = 1, \ldots, m$) for any $j$ ($j = 1, \ldots, N$). Put

$\epsilon_2 = \tfrac{1}{2} \min \{g_i(x^*(\alpha'_j)) : i \notin B\alpha'_j,\ j = 1, \ldots, N\}$.

By construction, $\epsilon_2 > 0$. By the uniform continuity of $g_i(x^*(\alpha))$ ($i = 1, \ldots, m$) on $[0,1]$, there exists a scalar $\delta_2 > 0$ such that $|\alpha - \alpha'_j| < \delta_2$ implies $|g_i(x^*(\alpha)) - g_i(x^*(\alpha'_j))| < \epsilon_2$ ($i = 1, \ldots, m$) for any $j$ ($j = 1, \ldots, N$).

Put $\epsilon^* = \min\{\epsilon_1, \epsilon_2\}$ and $\Delta\alpha^* = 1/K$, where $K$ is the smallest integer satisfying $K > 2/\min\{\delta_1, \delta_2, \delta'\}$. In view of the Basic Theorem, to prove this theorem it is sufficient to show that for these choices of $\epsilon$ and $\Delta\alpha$ Newton's method is well-defined and sure to be convergent as stated in Steps 2 and 3, and that the trials at Step 3 must lead to a success.

At each application of Newton's method during Step 2, $(x^{J-1}, u^{J-1}) = (x^*((J-1)\Delta\alpha), u^*((J-1)\Delta\alpha))$ and $S^o$ is valid at $\alpha = (J-1)\Delta\alpha$. If $S^o$ is valid at $J\Delta\alpha$, then since $\Delta\alpha^* < \delta'$ we have by Corollary 4.1 that Newton's method is well-defined and convergent to $(x^*(J\Delta\alpha), u^*(J\Delta\alpha))$. If $S^o$ is not valid at $J\Delta\alpha$, then since $\Delta\alpha^* < \ell'' < \bar{\ell}$ there must be exactly one point of change $\alpha' < 1$ on $[(J-1)\Delta\alpha, J\Delta\alpha]$; but $S^o$ is valid at $\alpha'$, $\Delta\alpha^* < \ell''$, and $\Delta\alpha^* < \delta'$, so by Corollary 4.2 Newton's method is well-defined and convergent to $(x^{S^o}(J\Delta\alpha), u^{S^o}(J\Delta\alpha))$, and Step 3 is entered. By the choice of $\epsilon^*$, $A = A\alpha'$ and $B = B\alpha'$. Corollary 4.2 again applies, and ensures that Newton's method is well-defined and convergent to $(x^S(J\Delta\alpha), u^S(J\Delta\alpha))$ when $A \subseteq S \subseteq B$. The trials are sure to lead to a success because some set which is valid at $\alpha'$ must also be valid at $J\Delta\alpha$, since $\alpha'$ is the only point of change on $[(J-1)\Delta\alpha, J\Delta\alpha]$.

A word is in order about the consequences of taking $\epsilon$ and $\Delta\alpha$ different from $\epsilon^*$ and $\Delta\alpha^*$. This is of considerable practical importance, since $\epsilon^*$ and $\Delta\alpha^*$ cannot be calculated beforehand. It is possible to give a detailed discussion of the difficulties caused in the Basic Computational Algorithm by "poor" choices of $\epsilon$ and $\Delta\alpha$, but we shall limit the present discussion to a few general remarks.

It is clear from the proof of the theorem that when $\epsilon = \epsilon^*$, any $\Delta\alpha \le \Delta\alpha^*$ will do; in fact, to every $\epsilon$, $0 < \epsilon \le \epsilon^*$, there exists $\Delta\alpha^*(\epsilon)$, $0 < \Delta\alpha^*(\epsilon) \le \Delta\alpha^*$, such that the Basic Computational Algorithm is well-defined and computationally finite when $\epsilon$ and $\Delta\alpha$ are used with $\Delta\alpha \le \Delta\alpha^*(\epsilon)$. Thus $\epsilon$ and $\Delta\alpha$ need not be exactly $\epsilon^*$ and $\Delta\alpha^*$ in order for the algorithm to be applicable. In general, however, the following qualitative assertions hold: (a) when $\epsilon$ is too small, there may be too few candidate sets at Step 3, i.e., there may be no set satisfying $A \subseteq S \subseteq B$ which is valid at $J\Delta\alpha$, so that Step 3 cannot be successfully completed; (b) when $\epsilon$ is too large, there may be too many candidate sets at Step 3, resulting in an excessive number of trials before Step 3 is successfully completed and possibly the break-down of Newton's method (lack of convergence or lack of existence of the required inverse matrix) for the trial sets which are not valid at $J\Delta\alpha$ and do not satisfy the hypotheses of Corollary 4.2 applied at the point of change just before $J\Delta\alpha$; (c) when $\Delta\alpha$ is too small, the algorithm is applicable but requires more executions of Step 2 increments in $\alpha$, thereby reducing the efficiency of the algorithm for a user who would be satisfied with knowing $(x^*(\alpha), u^*(\alpha))$ for a coarser grid of values; and (d) when $\Delta\alpha$ is too large, Newton's method is apt to be ill-defined, or divergent, or convergent to the wrong solution of $(=S)_{J\Delta\alpha}$, and it could happen that there is no set satisfying $A \subseteq S \subseteq B$ which is valid at $J\Delta\alpha$, so that Step 3 cannot be successfully completed.

Figure 1

Flow Chart of the Basic Computational Algorithm

Step 1. Solve $(P_0)$. Put $(x^o, u^o, S^o)$ equal to a solution of (KT-1) through (KT-4) at $\alpha = 0$. Choose step size $\Delta\alpha > 0$. Choose $\epsilon > 0$. Put $J = 0$.

Termination test. Terminate when $J\Delta\alpha = 1$.

Step 2. Iterate from $(x^{J-1}, u^{J-1})$ to $(x^J, u^J)$, the solution of $(=S^o)_{J\Delta\alpha}$, by Newton's method. Is $g_i(x^J) \ge 0$, $\forall i \notin S^o$, and $u_i^J \ge 0$, $\forall i \in S^o$? If yes ($S^o$ is still valid at $J\Delta\alpha$), write $(x^*(J\Delta\alpha), u^*(J\Delta\alpha)) = (x^J, u^J)$. If no, go to Step 3.

Step 3. Put $A = \{i : u_i^{J-1} > \epsilon\}$ and $B = \{i : g_i(x^{J-1}) < \epsilon\}$. Choose $S$ such that $A \subseteq S \subseteq B$ and $S$ not tried before at the current value of $J$. Put $S^o = S$. Iterate from $(x^{J-1}, u^{J-1})$ to $(x^J, u^J)$, the solution of $(=S^o)_{J\Delta\alpha}$, by Newton's method. Is $g_i(x^J) \ge 0$, $\forall i \notin S^o$, and $u_i^J \ge 0$, $\forall i \in S^o$? If yes (the trial was successful; $S^o$ is valid at $J\Delta\alpha$), write $(x^*(J\Delta\alpha), u^*(J\Delta\alpha)) = (x^J, u^J)$. If no (the trial was not successful; $S^o$ is not valid at $J\Delta\alpha$), choose another $S$.

Note: The notation used here is contradictory of that used elsewhere in this work: $(x^J, u^J)$ actually means $(x^{S^o}(J\Delta\alpha), u^{S^o}(J\Delta\alpha))$ at Step 2, for example.
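The steps of Figure 1 can be sketched on a small example. The sketch below is an illustration under stated assumptions, not the machine code of this work: the one-variable problem "maximize $-(x - 2\alpha)^2$ subject to $g(x) = 0.9 - x \ge 0$" is hypothetical, as are the function names. Because the objective is quadratic and the constraint linear, $(=S)_\alpha$ is linear and the single Newton step is written in closed form. The point of change is $\alpha' = 0.45$, which the grid does not hit exactly.

```python
from itertools import combinations

C = 0.9  # hypothetical constraint level: g(x) = C - x >= 0


def g(x):
    return [C - x]                      # one constraint, index 0


def solve_eq(S, alpha):
    """Solve (=S)_alpha for: maximize -(x - 2*alpha)**2 subject to g(x) >= 0."""
    if 0 in S:                          # g(x) = 0 and -2(x - 2*alpha) - u = 0
        x = C
        u = [-2.0 * (x - 2.0 * alpha)]
    else:                               # u = 0 and -2(x - 2*alpha) = 0
        x = 2.0 * alpha
        u = [0.0]
    return x, u


def passes(S, x, u):
    """g_i >= 0 for i not in S (feasibility), u_i >= 0 for i in S (optimality)."""
    gx = g(x)
    return (all(gx[i] >= 0 for i in range(len(gx)) if i not in S)
            and all(u[i] >= 0 for i in S))


def basic_algorithm(n_steps, eps):
    d_alpha = 1.0 / n_steps
    S = frozenset()                     # valid at alpha = 0 (x = 0 is interior)
    x, u = solve_eq(S, 0.0)
    history = [(0.0, x, S)]
    for j in range(1, n_steps + 1):
        alpha = j * d_alpha
        x_new, u_new = solve_eq(S, alpha)                 # Step 2
        if not passes(S, x_new, u_new):                   # Step 3
            gx = g(x)                                     # values at step J-1
            A = {i for i in range(len(gx)) if u[i] > eps}
            B = {i for i in range(len(gx)) if gx[i] < eps}
            free = sorted(B - A)
            for r in range(len(free) + 1):
                for extra in combinations(free, r):
                    T = frozenset(A | set(extra))
                    if T == S:
                        continue                          # already failed at Step 2
                    x_new, u_new = solve_eq(T, alpha)
                    if passes(T, x_new, u_new):
                        S = T
                        break
                else:
                    continue
                break
            else:
                raise RuntimeError("no candidate set succeeded; adjust eps or the step")
        x, u = x_new, u_new
        history.append((alpha, x, S))
    return history


history = basic_algorithm(n_steps=10, eps=0.25)
```

With these values the empty set fails at $\alpha = 0.5$, the slack $0.1$ at the previous grid point is below $\epsilon = 0.25$, and the trial set $\{0\}$ succeeds. Choosing $\epsilon$ below that slack would leave no candidate at Step 3, which is the behavior described in assertion (a).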

It is evident that $\epsilon$ and $\Delta\alpha$ must be selected by trial and error. A more powerful approach would be to modify $\epsilon$ and $\Delta\alpha$ adaptively as the computations proceed: one would provide for monitoring the number of iterations used by Newton's method each time it is employed and also the number of candidate sets at Step 3, and the basic strategy would be to increase $\Delta\alpha$ and/or decrease $\epsilon$ when the algorithm is making good progress and to decrease $\Delta\alpha$ and/or increase $\epsilon$ when the algorithm encounters difficulty. Such an approach was applied successfully in the design of the machine code used to solve the parametric problem of Chapter IV.
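One possible form of such an adaptive rule is sketched below. It is only an illustration: the thresholds, the factors of two, the bounds, and the function name `adapt` are all assumptions, and the rule actually used in the machine code of Chapter IV is not reproduced here.

```python
def adapt(d_alpha, eps, newton_iters, n_candidates, ok,
          fast_iters=3, slow_iters=8, many_candidates=4,
          d_alpha_bounds=(1e-6, 0.25), eps_bounds=(1e-10, 1e-1)):
    """Return updated (d_alpha, eps) after one step of the algorithm.

    newton_iters -- Newton iterations used by the last solve
    n_candidates -- candidate sets examined at Step 3 (0 if Step 3 was skipped)
    ok           -- False if Newton broke down or no candidate set succeeded
    """
    if not ok or newton_iters >= slow_iters:
        # Difficulty: decrease the step size and increase eps.
        d_alpha, eps = d_alpha * 0.5, eps * 2.0
    elif newton_iters <= fast_iters and n_candidates <= many_candidates:
        # Good progress: increase the step size and decrease eps.
        d_alpha, eps = d_alpha * 2.0, eps * 0.5
    lo, hi = d_alpha_bounds
    d_alpha = min(max(d_alpha, lo), hi)
    lo, hi = eps_bounds
    eps = min(max(eps, lo), hi)
    return d_alpha, eps


grown = adapt(0.1, 0.01, newton_iters=2, n_candidates=0, ok=True)
shrunk = adapt(0.1, 0.01, newton_iters=10, n_candidates=0, ok=True)
```

A step that is rejected would normally be repeated from the last accepted value of $\alpha$ with the reduced step size; that bookkeeping is omitted from the sketch.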


In addition to the possibility of increasing computational efficiency by adaptive selection of $\epsilon$ and $\Delta\alpha$, it is possible to greatly improve computational efficiency by using refinement, bordering, and partitioning methods for the inverse matrix required by Newton's method. A discussion of some of these devices is given in Appendix C. These devices, or others like them, should be incorporated into any machine code for implementing the present algorithm, or the number of matrix inversions required would probably preclude the use of Newton's method.


4. Further Study of Step 3

Step 3 of the Basic Conceptual Algorithm involves a certain amount of trial and error: at the point of change $\alpha'$, try different sets $S$ which are valid at $\alpha'$ (i.e., $A\alpha' \subseteq S \subseteq B\alpha'$) until one is found which is valid to the right of $\alpha'$. When $B\alpha' - A\alpha'$ is a singleton, then no erroneous trials will be made at Step 3; for there are only two eligible sets, one of which was found at Step 2 not to be valid to the right of $\alpha'$. When $B\alpha' - A\alpha'$ contains many constraint indices, however, many unsuccessful trials may have to be made before a set which is valid to the right of $\alpha'$ is found. It is therefore of interest to appraise how serious a difficulty the trial and error nature of Step 3 is likely to be, and to consider some ways of ameliorating this potential stumbling block.

It is possible to argue heuristically that $B\alpha' - A\alpha'$, which may be referred to as the set of degenerate constraints at $\alpha'$, will ordinarily consist of only one constraint. Let $\alpha_o \in [0,1]$ be fixed, and assume that Conditions 1 through 4 hold. From the sufficiency of the Kuhn-Tucker Theorem, it follows that $x^*(\alpha_o)$ also is the optimum solution to the problem

Maximize over $x$: $f(x;\alpha_o)$ subject to $g_i(x) \ge 0$, $\forall i \in A\alpha_o$.

In other words, all constraints except those of $A\alpha_o$ are redundant. The fact that some of them, namely those of $B\alpha_o - A\alpha_o$, happen to be exactly satisfied at $x^*(\alpha_o)$ can be viewed as an "accident." It seems more likely that a redundant constraint will be slack at $x^*(\alpha_o)$, as those of $M - B\alpha_o$ are. If $\alpha_o$ is not a point of change, we conclude that $B\alpha_o - A\alpha_o$ is likely to be empty ($B\alpha_o - A\alpha_o = \emptyset$ implies that there is exactly one valid set at $\alpha_o$). The set $B\alpha_o - A\alpha_o$ is sure to contain at least one constraint, however, when $\alpha_o$ is a point of change, for as $\alpha$ traverses the unit interval continuity dictates that the only way a constraint can make the transition from slack to active or conversely is to pass through $B\alpha - A\alpha$. Unless there is strong interdependence between different constraints, not more than one or two constraints are likely to be involved in such a transition at any given point of change.

Remark: The last observation brings up an interesting point regarding the testing of new mathematical programming algorithms. Often a new algorithm is applied to a number of problems whose data were generated "randomly" in an effort to gain computational experience quickly and to judge the efficiency of the algorithm. In our case this procedure would very likely lead to results biased in favor of our algorithm. The reason, of course, is that "interdependence" between constraints is less likely to occur when problem data are generated randomly than when problem data derive from real applications; the result is that Step 3 will rarely require any erroneous trials for problems with randomized data.

The above heuristic argument, although somewhat comforting, does not preclude the possibility of $B\alpha' - A\alpha'$ being quite numerous (by Condition 4, $B\alpha$ can be composed of at most $n$ constraint indices, and so $B\alpha' - A\alpha'$ could have up to $n$ constraints). Faced with this possibility, one may follow two main courses of inquiry. One may attempt to construct methods of perturbing $(P_\alpha)$ so as to ensure that $B\alpha - A\alpha$ consists of only one or two constraints at each point of change (see Markowitz, 1956, p. 125, and Zahl, 1964, p. 156). Alternatively, one may attempt to devise rules for deciding in what order the trials should be made at Step 3 (the Basic Conceptual Algorithm is ambiguous in this respect) so as to tend to keep the number of erroneous trials small. We choose to follow the second course of inquiry, because (a) this type of investigation is conspicuously lacking at present (for a notable exception in the context of a related problem see Theil and Van de Panne, 1960), and (b) the second course of inquiry must be undertaken before the need for perturbation can be established.

4.1 Preliminary Remarks on Determining the Order of Trials at Step 3

We begin by establishing some terminology. Suppose that Step 2 has ended with the point of change $\alpha' < 1$. Let $\alpha'+$ be a point between $\alpha'$ and the next largest point of change. If $S$ is valid at $\alpha'$ but not at $\alpha'+$, the unique continuous solution of $(=S)_\alpha$ satisfying the left end-point value $(x^*(\alpha'), u^*(\alpha'))$ violates either (KT-3) or (KT-4), or possibly both, as $\alpha$ increases above $\alpha'$. In other words, $S$ "causes an alarm" as $\alpha$ increases 1/ above $\alpha'$. A violation of (KT-3) is called a feasibility alarm, while a violation of (KT-4) is called an optimality alarm. By continuity, the set of feasibility alarms must be contained in $B\alpha' - S$, and the set of optimality alarms must be contained in the set $S - A\alpha'$; hence all alarms are from $B\alpha' - A\alpha'$. Since $S$ is not valid at $\alpha'+$, by Corollary 1.2 either $(S - B\alpha'+) \neq \emptyset$ or $(A\alpha'+ - S) \neq \emptyset$. The set $S - B\alpha'+$ will be called the excess of $S$ at $\alpha'+$, and $A\alpha'+ - S$ will be called the deficiency of $S$ at $\alpha'+$.

Clearly the smallest change in $S$ which will result in a set which is valid at $\alpha'+$ is to delete its excess and add its deficiency. The number of constraint indices of $\{A\alpha'+ - S\} \cup \{S - B\alpha'+\}$ is therefore a measure of the minimum distance, 2/ which we denote by $d(S)$, between $S$ and the collection of all sets which are valid at $\alpha'+$.

1/ Since $x_i^S(\alpha)$ and $u_i^S(\alpha)$ are analytic functions, there is an $\epsilon > 0$ such that each component of $(g(x^S(\alpha)), u^S(\alpha))$ has constant sign on $(\alpha', \alpha' + \epsilon)$. It is in this sense that we define the alarms caused by $S$ "as $\alpha$ increases above $\alpha'$."

2/ The distance between a set $C$ and a set $D$, where $C$ and $D$ are both subsets of $M$, can be defined as the number of elements in the set $\{C - D\} \cup \{D - C\}$. It is readily verified that this definition meets all of the usual requirements of a distance metric and hence makes a metric space out of the set of all subsets of $M$.

Figure 2 is designed to help the reader visualize the various sets mentioned above for a hypothetical case, and it will be convenient to refer to it occasionally during the rest of this section. Each dot represents a constraint; there are fifteen in all. The constraints in $S$ are circled to distinguish them from the others. Constraints 6, 8, and 10 are labelled "g" to signify that they are potential feasibility alarms ($B\alpha' - S$), and constraints 7, 9, and 11 are labelled "u" to signify that they are potential optimality alarms ($S - A\alpha'$). The deficiency of $S$ at $\alpha'+$ is precisely constraint 6, and the excess is constraint 11.
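The excess, the deficiency, and the distance $d(S)$ are plain set operations, as the following sketch shows on the data of Figure 2. The membership of constraints 6 through 11 follows the description in the text; taking $A\alpha' = \{1, \ldots, 5\}$ is an assumption made only so that the example is complete, and the function names are hypothetical.

```python
def excess(S, B_plus):
    """Constraints of S that are not active at alpha'+ : S - B(alpha'+)."""
    return S - B_plus


def deficiency(S, A_plus):
    """Constraints required at alpha'+ that S lacks: A(alpha'+) - S."""
    return A_plus - S


def distance(C, D):
    """Number of elements in (C - D) union (D - C)."""
    return len(C ^ D)


def d(S, A_plus, B_plus):
    """Minimum distance from S to the collection of sets valid at alpha'+."""
    return len(deficiency(S, A_plus) | excess(S, B_plus))


A_prime = set(range(1, 6))              # assumed: constraints 1..5
B_prime = set(range(1, 12))             # degenerate at alpha': 6..11
S = A_prime | {7, 9, 11}                # the circled constraints
A_plus = A_prime | {6, 7}               # A at alpha'+
B_plus = A_prime | {6, 7, 8, 9}         # B at alpha'+

nearest_valid = (S - excess(S, B_plus)) | deficiency(S, A_plus)
```

Here the deficiency is $\{6\}$, the excess is $\{11\}$, and $d(S) = 2$; deleting the excess and adding the deficiency produces a set lying between $A\alpha'+$ and $B\alpha'+$.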

Can one guess, by observing which feasibility and optimality alarms $S$ causes as $\alpha$ increases above $\alpha'$, what changes can be made in $S$ in order for it to be valid at $\alpha'+$? It is tempting to conjecture that any constraint (in $S$) which yields an optimality alarm should be deleted from $S$, for it is well-known (e.g., see Wilde, 1962) that a dual variable may be interpreted as giving the marginal decrease of the value of the objective function with respect to an increase in the "right-hand side" of the corresponding constraint. Similarly, it is tempting to conjecture that any constraint (not in $S$) which yields a feasibility alarm should be added to $S$ in order that it remain satisfied as $\alpha$ increases above $\alpha'$. If this line of reasoning were correct, then by deleting the constraints which yield optimality alarms and adding those which yield feasibility alarms, one could obtain from $S$ a set which is valid at $\alpha'+$; for the optimality alarms would coincide with the excess of $S$ and the feasibility alarms would coincide with the deficiency of $S$. Unfortunately this is not the case, because the interactions between constraints which are degenerate at $\alpha'$ have been ignored. It is therefore possible to construct simple examples (see Appendix B) for which there are false and silent alarms.

By a false alarm we mean a feasibility alarm which is not from the deficiency of $S$ at $\alpha'+$ and not from the set of degenerate constraints at $\alpha'+$, or an optimality alarm which is not from the excess of $S$ at $\alpha'+$ and not from the set of degenerate constraints at $\alpha'+$. By a silent feasibility alarm we mean the absence of a feasibility alarm from a constraint in the deficiency of $S$ at $\alpha'+$, and by a silent optimality alarm we refer to the absence of an optimality alarm from a constraint in the excess of $S$ at $\alpha'+$. In terms of Figure 2, a false feasibility alarm would be an alarm from constraint number 10, a false optimality alarm would be an alarm from 7, a silent feasibility alarm would be the absence of an alarm from 6, and a silent optimality alarm would be the absence of an alarm from 11. Note that the alarms from the set of constraints which are degenerate at $\alpha'+$ ($B\alpha'+ - A\alpha'+$), if any, are immaterial, for the presence or absence of these constraints (numbers 8 and 9 in Figure 2) for a trial set does not affect its validity at $\alpha'+$.
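These definitions translate directly into set differences. The sketch below classifies a hypothetical pattern of alarms for the case of Figure 2; the observed alarms and the choice $A\alpha' = \{1, \ldots, 5\}$ are assumptions made for illustration, and `classify_alarms` is a hypothetical name.

```python
def classify_alarms(feas_alarms, opt_alarms, S, A_plus, B_plus):
    """Sort observed alarms into false and silent ones, per the definitions above."""
    deficiency = A_plus - S             # A(alpha'+) - S
    excess = S - B_plus                 # S - B(alpha'+)
    degenerate = B_plus - A_plus        # alarms from here are immaterial
    return {
        "false_feasibility": feas_alarms - deficiency - degenerate,
        "false_optimality": opt_alarms - excess - degenerate,
        "silent_feasibility": deficiency - feas_alarms,
        "silent_optimality": excess - opt_alarms,
    }


A_prime = set(range(1, 6))              # assumed: constraints 1..5
S = A_prime | {7, 9, 11}
A_plus = A_prime | {6, 7}
B_plus = A_prime | {6, 7, 8, 9}

# Hypothetical observation: feasibility alarms from 8 and 10,
# optimality alarms from 7 and 11.
report = classify_alarms({8, 10}, {7, 11}, S, A_plus, B_plus)
```

In this pattern the alarm from 10 is a false feasibility alarm, the alarm from 7 is a false optimality alarm, and constraint 6 is a silent feasibility alarm; only the alarm from 11 points to a constraint that must actually be changed.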

The above remarks indicate that not very much information about what constitutes a valid set at $\alpha'+$ can be gleaned from a trial which fails at Step 3. Evidently the statement of Corollary 1.1 that $A\alpha' \subseteq A\alpha'+ \subseteq B\alpha'+ \subseteq B\alpha'$ is about as strong a statement as can be made. As has already been pointed out, this is already a very strong statement in the likely event that there are only a few degenerate constraints at $\alpha'$. Yet when a trial set fails at Step 3 there is one clue to the identity of a set which is valid at $\alpha'+$ that can be salvaged: at least one of the alarms given during a failure is from the deficiency or excess at $\alpha'+$ of the trial set. In the next subsection we shall prove this fact. The result will then be used to devise an ordering of trials at Step 3.

4.2 Sharpening Corollary 2.1

Lemma 6.1:

Let $\alpha' \in [0,1]$ be a point of change, let $S$ be valid at $\alpha'$, and assume that Conditions 1 through 4 hold.

Then there exists a convex set $X' \supset X$ and an open interval containing and symmetric about $\alpha'$ and contained in $I_{\alpha'}$ such that, for each fixed value of $\alpha$ in this interval, $x^S(\alpha)$ is the optimal solution of

Maximize over $x \in X'$: $f(x;\alpha)$
subject to $g_i(x) = 0$, $\forall i \in (S - S^+_\alpha)$
$g_i(x) \ge 0$, $\forall i \in S^+_\alpha$,

where $S^+_\alpha \subseteq \{i \in S : u_i^S(\alpha) \ge 0\}$.

Proof: Arguing as in Proposition 5 and using the continuity of $u^S(\alpha)$ and the fact that $u^S(\alpha') = u^*(\alpha') \ge 0$, one obtains (here we employ the notations of Proposition 5) that $\max \lambda\big(\nabla^2(f(x;\alpha) + \sum_{i=1}^m u_i^S(\alpha) g_i(x))\big)$ is negative on $X \times \alpha'$ and continuous on some open region containing this direct product set. By the compactness and convexity of $X \times \alpha'$, it follows that the Hessian of the Lagrangian function $f(x;\alpha) + \sum_{i=1}^m u_i^S(\alpha) g_i(x)$ is negative definite on some open convex region $X' \times I'_{\alpha'}$ containing $X \times \alpha'$. In view of A.5, the Lagrangian function must be strictly concave with respect to $x$ on the open convex set $X'$ for each fixed value of $\alpha \in I'_{\alpha'}$.

Now $x^S(\alpha') = x^*(\alpha') \in X \subset X'$, $X'$ open; since $x^S(\alpha)$ is continuous on $I_{\alpha'}$, one obtains that $x^S(\alpha) \in X'$ for all $\alpha$ sufficiently near $\alpha'$. Since the gradient with respect to $x$ of the Lagrangian function vanishes at $x^S(\alpha)$, we conclude by A.6 that $x^S(\alpha)$ is the global maximum of that function on the convex set $X'$ for any fixed $\alpha$ sufficiently near $\alpha'$. Using the fact that $u_i^S(\alpha) = 0$, $\forall i \notin S$, and $g_i(x^S(\alpha)) = 0$, $\forall i \in S$, one obtains, for any fixed $\alpha$ sufficiently near $\alpha'$, that

(14) $f(x^S(\alpha);\alpha) \ge f(x;\alpha) + \sum_{i \in S} u_i^S(\alpha) g_i(x)$, $\forall x \in X'$.

Since $\sum_{i \in S} u_i^S(\alpha) g_i(x) \ge 0$ for all $x$ such that $g_i(x) = 0$, $\forall i \in (S - S^+_\alpha)$, and $g_i(x) \ge 0$, $\forall i \in S^+_\alpha$, where $S^+_\alpha \subseteq \{i \in S : u_i^S(\alpha) \ge 0\}$, the conclusion of the lemma follows from (14).

Remark: An easy proof of this lemma can be constructed from the Kuhn-Tucker Theorem when all constraints are linear; in this case $X'$ may be taken to be all of $E^n$. When all constraints are linear, specialization of the Kuhn-Tucker Theorem reveals that $(=S)_\alpha$ are necessary and sufficient conditions for a maximum of $f(x;\alpha)$ subject to $g_i(x) = 0$, $\forall i \in S$.

Remark: The region $X'$ may be taken to be contained in the open region mentioned in Condition 1.


Theorem 6:

Let $\alpha' \in [0,1]$ be a point of change, let $S$ be valid at $\alpha'$, and assume that Conditions 1 through 4 hold.

Then there exists an open interval containing and symmetric about $\alpha'$ and contained in $I_{\alpha'}$ such that, for each fixed value of $\alpha$ in this interval, the following three assertions are equivalent:

(i) $S$ is valid at $\alpha$.

(ii) $(x^S(\alpha), u^S(\alpha)) = (x^*(\alpha), u^*(\alpha))$.

(iii) $g_i(x^S(\alpha)) \ge 0$, $\forall i \in (A\alpha - S)$
$u_i^S(\alpha) \ge 0$, $\forall i \in (S - B\alpha)$.


Proof: The equivalence of (i) and (ii) and the fact that (ii) implies (iii) are known from Corollary 2.1. To complete the proof of the theorem, it is sufficient to show that (iii) implies (ii) on the interval mentioned in Lemma 6.1.

Assume that (iii) holds for some fixed value of $\alpha$ in the interval mentioned in Lemma 6.1. Using the assumption that $u_i^S(\alpha) \ge 0$, $\forall i \in (S - B\alpha)$, and applying Lemma 6.1 with $S^+_\alpha = (S - B\alpha)$, one may assert the existence of a convex set $X' \supset X$ such that $x^S(\alpha)$ is an optimal solution of

(15) Maximize over $x \in X'$: $f(x;\alpha)$
subject to $g_i(x) = 0$, $\forall i \in \{B\alpha \cap S\}$
$g_i(x) \ge 0$, $\forall i \in \{S - B\alpha\}$.

Using the assumption that $g_i(x^S(\alpha)) \ge 0$, $\forall i \in (A\alpha - S)$, we have that $x^S(\alpha)$ is feasible in

(16) Maximize over $x \in X'$: $f(x;\alpha)$
subject to $g_i(x) = 0$, $\forall i \in \{B\alpha \cap S\}$
$g_i(x) \ge 0$, $\forall i \in \{S - B\alpha\} \cup \{A\alpha - S\}$.

Since the feasible region of (16) is included in that of (15), $x^S(\alpha)$ must be an optimal solution of (16).

It follows from A.4 and A.6 and the fact that $(x^*(\alpha), u^*(\alpha))$ satisfies $(=A\alpha)_\alpha$ that $x^*(\alpha)$ is optimal in

(17) Maximize over $x \in X'$: $f(x;\alpha)$ subject to $g_i(x) \ge 0$, $\forall i \in A\alpha$.

Since the feasible region of (16) is included in that of (17), and since $x^*(\alpha)$ is feasible in (16), $x^*(\alpha)$ must be optimal in (16). That is, both $x^S(\alpha)$ and $x^*(\alpha)$ are optimal in (16); thus $f(x^S(\alpha);\alpha) = f(x^*(\alpha);\alpha)$. Because $x^S(\alpha)$ is feasible in (17), therefore, we finally have that $x^S(\alpha)$ is optimal in (17). Since (17) must have a unique optimal solution by A.2, $x^S(\alpha) = x^*(\alpha)$. This implies, by Condition 4, that $u^S(\alpha) = u^*(\alpha)$. Thus (ii) holds.

“Ihe  significance  of  this  sharpening  of  Corollary  2.1  is  that 
it  rules  out  the  possibility  that  all  alarms  are  either  false  or  from 
the  set  of  degenerate  constraints  at  a'+  when  S  is  not  valid  at 
(X  +.  That  is,  at  least  one  alarm  is  from  the  deficiency  or  excess 
of  S  at  a'+. 
4.3  Modification of Step 3: Determining the Order of Trials

Suppose that Step 2 has ended with the point of change $\alpha' < 1$.
Designate the set of alarms which are given by $S^\circ$ (the set used
during Step 2) as $\alpha$ increases above $\alpha'$ by $T$. Applying Theorem 6
at $\alpha'$, we know that at least one of the alarms is from the excess
or deficiency of $S^\circ$ at $\alpha'+$. Unfortunately, we do not know which
one. A logical way of proceeding at Step 3 is to modify $S^\circ$ by one
constraint at a time for each constraint in $T$, i.e., try the sets
$S^\circ \pm i$ for each $i \in T$, where the symbol $S^\circ \pm i$ means
$S^\circ \cup i$ if $i \notin S^\circ$ and $S^\circ - i$ if $i \in S^\circ$.
This notation is designed to avoid having to distinguish between
feasibility and optimality alarms. In other words, add the constraints
which were feasibility alarms to $S^\circ$ and delete constraints which
were optimality alarms from $S^\circ$ one at a time until each alarm has
been heeded individually. Note that $S^\circ \pm i$, $i \in T$, is valid at
$\alpha'$ since all alarms caused by a set which is valid at $\alpha'$ must
be from $B_{\alpha'} - A_{\alpha'}$. Hence $S^\circ \pm i$, $i \in T$,
satisfies (8.1).

When $T$ has been exhausted by this first generation of trials,
at least one trial set, say $S^\circ \pm i_o$, is one unit of distance
closer to a valid set at $\alpha'+$. If $d(S^\circ) = 1$ then
$S^\circ \pm i_o$ is valid at $\alpha'+$ and Step 3 has been successfully
completed. If $d(S^\circ) > 1$ then $d(S^\circ \pm i_o) = d(S^\circ) - 1 > 0$,
and a second generation of trials is necessary. At each first
generation trial, let $T_i$ denote the alarms due to $S^\circ \pm i$,
$i \in T$. At the second generation one should try $S^\circ \pm i \pm j$
for all $i \in T$ and all $j \in T_i$. The symbol $S^\circ \pm i \pm j$
means $\{S^\circ \pm i\} \cup j$ if $j \notin S^\circ \pm i$ and
$\{S^\circ \pm i\} - j$ if $j \in S^\circ \pm i$. Applying Theorem 6 at
$\alpha'$ with $S = S^\circ \pm i_o$, we see that at least one of the alarms
due to $S^\circ \pm i_o$ is from the excess or deficiency of
$S^\circ \pm i_o$ at $\alpha'+$, but we do not know which one. Hence at
least one of the sets $S^\circ \pm i_o \pm j$, $j \in T_{i_o}$, is one unit
of distance closer to a set which is valid at $\alpha'+$. Designate one
such set by $S^\circ \pm i_o \pm j_o$. If $d(S^\circ) = 2$ then
$S^\circ \pm i_o \pm j_o$ is valid at $\alpha'+$, and Step 3 has been
successfully completed. If $d(S^\circ) > 2$, then
$d(S^\circ \pm i_o \pm j_o) = d(S^\circ) - 2 > 0$, and a third generation
of trials is necessary.

The third generation of trials is constructed in a manner analogous
to the preceding generations, and so on for the higher order generations.
If at any trial a set is encountered which has been tried before, it
may, of course, be discarded.

At each generation the distance from some trial set, and perhaps
several, to the collection of all sets which are valid at $\alpha'+$ is
decreased by one unit. Since $d(S^\circ)$ is finite (in fact it is bounded
by the number of constraints in $B_{\alpha'} - A_{\alpha'}$ minus the number
of constraints in $B_{\alpha'+} - A_{\alpha'+}$), after a finite number of
generations of trials a set which is valid at $\alpha'+$ will be obtained;
after exactly $d(S^\circ)$ generations, in fact. The nearest valid set is,
it will be recalled, $S^\circ$ plus its deficiency at $\alpha'+$ minus its
excess at $\alpha'+$. These rules are summarized below.

Order of Trials at Step 3

1.  Let $T$ denote the alarms which are given by $S^\circ$ as $\alpha$
    increases above $\alpha'$. At the first generation of trials,
    try $S^\circ \pm i$ for each $i \in T$. Let $T_i$ denote the set of
    alarms which are given by $S^\circ \pm i$, $i \in T$, as $\alpha$
    increases above $\alpha'$. If $T_{i^*} = \emptyset$ for some
    $i^* \in T$, then $S^\circ \pm i^*$ is valid at $\alpha'+$, and Step 3
    has been completed; otherwise, go on to the second generation
    of trials.

2.  At the second generation of trials, try $S^\circ \pm i \pm j$ for
    each $i \in T$ and all $j \in T_i$. Let $T_{ij}$ be the set of alarms
    which are given by $S^\circ \pm i \pm j$, $i \in T$ and $j \in T_i$,
    as $\alpha$ increases above $\alpha'$. If $T_{i^*j^*} = \emptyset$ for
    some $i^* \in T$ and $j^* \in T_{i^*}$, then $S^\circ \pm i^* \pm j^*$
    is valid at $\alpha'+$, and Step 3 has been completed; otherwise,
    go on to a third generation of trials.

Etc.  (Omit any sets which have been tried previously.)
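
The rules above amount to a generation-by-generation search in which each trial set is obtained from its parent by adding or deleting one alarmed constraint. The following is a minimal Python sketch of that search, not the author's program. The names `toggle`, `order_of_trials`, and the callable `alarms` are hypothetical: `alarms(S)` is assumed to return the set of constraints that give a feasibility or optimality alarm when $(=S)_\alpha$ is solved just above $\alpha'$, and how that set is computed is left outside the sketch.

```python
from typing import Callable, FrozenSet, Optional

ConstraintSet = FrozenSet[int]


def toggle(s: ConstraintSet, i: int) -> ConstraintSet:
    """The operation S +/- i: add i if it is absent, delete it if present."""
    return s ^ frozenset({i})


def order_of_trials(
    s0: ConstraintSet,
    alarms: Callable[[ConstraintSet], ConstraintSet],
    max_generations: int = 20,
) -> Optional[ConstraintSet]:
    """Search by generations for a set that gives no alarms.

    `alarms` is a hypothetical oracle standing in for the solution of
    (=S)_alpha just above the point of change.
    """
    if not alarms(s0):
        return s0
    tried = {s0}
    generation = [s0]
    for _ in range(max_generations):
        next_generation = []
        for s in generation:
            # Heed each alarm of s individually, one constraint at a time.
            for i in sorted(alarms(s)):
                trial = toggle(s, i)
                if trial in tried:
                    continue  # omit sets which have been tried previously
                tried.add(trial)
                if not alarms(trial):
                    return trial  # no alarms: the trial set is accepted
                next_generation.append(trial)
        if not next_generation:
            return None
        generation = next_generation
    return None
```

With a toy oracle that reports exactly the deficiency and excess relative to a known valid set (no false or silent alarms), the search starting two units away from that set ends in the second generation, in line with the count of $d(S^\circ)$ generations stated above.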


Since the only modification of Step 3 being suggested here is a
more complete specification of the order in which the trial sets are
to be considered, and since this order has been shown to lead to a
successful completion of Step 3, the assertions of the Basic Theorem
still apply to the Basic Conceptual Algorithm with Step 3 modified as
above.

If these rules are to be incorporated into the Basic Computational
Algorithm, then in order to ensure that Theorem 6 (and hence the above
rules) applies, it is necessary to take the step size less than one-half
the length of the smallest of the intervals of Theorem 6 applied at each
point of change.
We do not hold that the order of trials suggested here is the most
efficient order which can be devised. However, the following advantages
are to be noted:

(1)  Each unsuccessful trial helps to determine the order of
     successive trials.

(2)  The suggested order of trials always leads to the (unique)
     valid set nearest $S^\circ$.

(3)  A valid set is found after exactly $d(S^\circ)$ generations of
     trials. In this sense search termination is predictable,
     although not a priori so.

(4)  $S^\circ$ is deformed one constraint at a time from trial to
     trial, so that the computational machinery is upset the
     least amount possible.

5.  Some Extensions

5.1  Linear Equality Constraints

Let the constraints of (LQ) include some linear equality
constraints. It is clear that if each such constraint is written as
a pair of inequality constraints (i.e., if the pair $g_i(x) \ge 0$,
$-g_i(x) \ge 0$ is written in place of $g_i(x) = 0$), then Condition 4
never holds. Fortunately, it can be shown that a simple modification of
the Basic Conceptual Algorithm obviates this difficulty: always include
the linear equality constraints in $S^\circ$ at Step 2 and in the trial
sets at Step 3 (ignore any optimality alarms that such constraints may
give). If all of the constraints happen to be linear equalities, in
fact, Step 3 would disappear entirely.
5.2  More General Parametric Problems

With appropriate modifications of the four conditions, it can be
shown (Geoffrion, 1965) that many of the results of this chapter apply
to any one-dimensional perturbation of

($P_p$)  Maximize $f(x,p)$ over $x$ subject to $g(x,p) \ge 0$,

where the parameter $p = (p_1, \ldots, p_k)$ varies over a convex set $P$ in
$E^k$, $f(x,p)$ is continuous in $(x,p)$ and strictly concave in $x$ for
each $p \in P$, and $g_i(x,p)$ ($i = 1, \ldots, m$) is concave in $(x,p)$. By
a one-dimensional perturbation of ($P_p$) we mean a parametric problem
of the form

    Maximize $f(x, p' + \alpha(p'' - p'))$ over $x$
    subject to $g(x, p' + \alpha(p'' - p')) \ge 0$

for each value of $\alpha \in [0,1]$, where $p', p'' \in P$.

It is evident that ($P_p$) is general enough to include many of
the parametric problems of interest to those who wish to perform
sensitivity analysis on concave programming problems.




CHAPTER  IV 


An  Illustrative  Example 

A simple model of a firm will be used to illustrate the manipulation
and solution of a decision problem under uncertainty by means
of the techniques presented in the preceding three chapters.


1.  A Decision Problem Under Uncertainty

Consider a hypothetical firm which produces and sells $n$ products
in an imperfectly competitive market. Assume that the cost of producing
and selling each unit of product $i$ is $c_i$ dollars per unit, and
that the total dollar revenue accruing from the sale of $x_i$ units of
product $i$ is $r_i(x_i) = (a_i + b_i\beta - d_i)x_i + (d_i/k_i)\ln(k_i x_i + 1)$,
where $d_i$, $k_i$ are positive scalars, $\ln(\cdot)$ denotes the natural log,
and $\beta$ is a price index. The interpretation of $r_i(x_i)$ becomes clearer
if one examines $dr_i(x_i)/dx_i = a_i + b_i\beta - d_i + d_i/(k_i x_i + 1)$. Since
$dr_i(0)/dx_i = a_i + b_i\beta$ and $dr_i(\infty)/dx_i = a_i + b_i\beta - d_i$,
we see that price gradually decreases from $a_i + b_i\beta$ (notice the
linear dependence on the price index) to $a_i + b_i\beta - d_i$ dollars per
unit as production increases without bound. The value of $k_i$ determines
the rapidity of the price decrease, and it is easily shown that a
proportion $0 < t < 1$ of the total possible price decrease $d_i$ is
achieved at $x_i = t/((1-t)k_i)$.
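
The statement about the proportion $t$ can be verified in two lines from the marginal revenue given above. The derivation below uses only the symbols already defined in this section.

```latex
\begin{align*}
\frac{dr_i}{dx_i}(0) - \frac{dr_i}{dx_i}(x_i)
  &= d_i - \frac{d_i}{k_i x_i + 1}
   = d_i \, \frac{k_i x_i}{k_i x_i + 1}, \\
\frac{k_i x_i}{k_i x_i + 1} = t
  &\iff k_i x_i (1 - t) = t
  \iff x_i = \frac{t}{(1 - t)\, k_i}.
\end{align*}
```

For instance, half of the total possible decrease ($t = 1/2$) is reached at $x_i = 1/k_i$.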

If we denote the (short-run) resource and other constraints
(including $x \ge 0$) by $g(x) \ge 0$, then assuming that the firm can
sell all it produces the profit maximization problem is

(1)  Maximize over $x$:  $\sum_{i=1}^{n} \left[ (a_i + b_i\beta - d_i)x_i + (d_i/k_i)\ln(k_i x_i + 1) - c_i x_i \right]$
     subject to $g(x) \ge 0$.

We shall assume that all functions and coefficients are known except
the price index $\beta$, which will be regarded as a random variable with
a known cumulative distribution function $\Phi(\beta)$.


2.  Circumventing Uncertainty by a Vector Maximum Reformulation

In order to circumvent the uncertainty attending the objective
function of (1), we elect to employ one of the approaches considered
at some length in Chapter I: a vector maximum reformulation using
the expected value criterion and the maximum .05-fractile criterion
(some fractile other than the .05-fractile could be used if desired).
Assume that $\Phi(\beta)$ is continuous, strictly increasing on the entire
real line, and that its mean is zero (if the mean is not zero, it
can be incorporated into the $a_i$). One derives that the mean and
.05-fractile of the objective function for fixed $x$ are, respectively,

    $f_1(x) = \sum_{i=1}^{n} \left[ (a_i - c_i - d_i)x_i + (d_i/k_i)\ln(k_i x_i + 1) \right]$

    $f_2(x) = f_1(x) + \Phi^{-1}(.05) \sum_{i=1}^{n} b_i x_i$   if  $\sum_{i=1}^{n} b_i x_i \ge 0$

    $f_2(x) = f_1(x) + \Phi^{-1}(.95) \sum_{i=1}^{n} b_i x_i$   if  $\sum_{i=1}^{n} b_i x_i \le 0$.

In place of (1) we consider the vector maximum problem

(2)  "Maximize" over $x$:  $f_1(x)$, $f_2(x)$
     subject to $g(x) \ge 0$.

The efficient outcomes of (2) are to be computed and plotted (as in
Figure 3 below) so as to present a "tradeoff curve" between the two
criteria. A decision-maker then subjectively determines a point on
the tradeoff curve, and implements the corresponding optimal production
schedule.

3.  An  Equivalent  Parametric  Programming  Reformulation 

The hessian of $f_1(x)$ is a diagonal matrix, with $-k_i d_i/(k_i x_i + 1)^2$
on the diagonal. When $x \ge 0$, the assumed positivity of $k_i$ and
$d_i$ implies that this hessian is negative definite. By A.3, therefore,
$f_1(x)$ is seen to be strictly concave on the non-negative orthant.
An enumeration of cases shows that $f_2(x)$ is also strictly concave
on the non-negative orthant when $\Phi^{-1}(.05) < 0$ and $\Phi^{-1}(.95) > 0$.
In view of our assumption that the mean is 0, it is reasonable to
assume that this last condition holds. Assuming further that each
constraint function is concave, we conclude that Proposition 6 of
Chapter II applies. 1/  Hence to find all efficient solutions of (2)
it is equivalent to find the optimal solutions of

(3)  Maximize over $x$:  $(1-\alpha) f_1(x) + \alpha f_2(x)$
     subject to $g(x) \ge 0$

for each value of $\alpha$ in the unit interval.
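
The concavity claim for $f_1$ rests on a direct computation of second derivatives, written out here as a short worked step. Nothing beyond the definition of $f_1$ and the positivity of $k_i$ and $d_i$ is assumed.

```latex
\begin{align*}
\frac{\partial f_1}{\partial x_i}
  &= a_i - c_i - d_i + \frac{d_i}{k_i x_i + 1}, \\
\frac{\partial^2 f_1}{\partial x_i^2}
  &= -\frac{k_i d_i}{(k_i x_i + 1)^2} < 0
  \qquad (x_i \ge 0,\; k_i > 0,\; d_i > 0), \\
\frac{\partial^2 f_1}{\partial x_i \, \partial x_j}
  &= 0 \qquad (i \ne j).
\end{align*}
```

On any region where $f_2$ differs from $f_1$ only by a term linear in $x$, the two functions have the same hessian, so the weighted objective $(1-\alpha) f_1 + \alpha f_2$ has this same negative definite hessian there for every $\alpha$ in the unit interval.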

Consider (3) with $\alpha$ fixed. The presence of the logical
condition in the definition of $f_2$ makes the solution of (3) somewhat
awkward.

1/  It is easy to see that Proposition 6 still holds if the $f_i$ are
assumed to be strictly concave on $X$, and not necessarily on $E^n$.


One approach is to solve the pair of problems

(4)  Maximize over $x$:  $(1-\alpha) f_1(x) + \alpha \left[ f_1(x) + \Phi^{-1}(.05) \sum_i b_i x_i \right]$
     subject to  $g(x) \ge 0$
                 $\sum_i b_i x_i \ge 0$

(5)  Maximize over $x$:  $(1-\alpha) f_1(x) + \alpha \left[ f_1(x) + \Phi^{-1}(.95) \sum_i b_i x_i \right]$
     subject to  $g(x) \ge 0$
                 $\sum_i b_i x_i \le 0$.

The optimal value of (3) equals the larger of the optimal values
of (4) and (5), since the feasible regions of (4) and (5) are merely
a dichotomy of that of (3). We shall avoid this complication, however,
by requiring of our numerical example that $b_i \ge 0$ ($i = 1, \ldots, n$);
since $x \ge 0$, this condition implies that $\sum_i b_i x_i \ge 0$, and
therefore (3) may be rewritten as

(6)  Maximize over $x$:  $(1-\alpha) f_1(x) + \alpha \left[ f_1(x) + \Phi^{-1}(.05) \sum_i b_i x_i \right]$
     subject to  $g(x) \ge 0$.

4.  Solving  the  Parametric  Problem 

We shall solve a numerical example based on (6) with $n = 4$
and $m = 7$. Table 1 gives the numerical data for the objective
function, and the constraints are: 1/

    $x_i \ge 0$,   $i = 1, \ldots, 4$
    $.01x_1 - .01x_2$ [illegible] $- .04x_4 + 2 \ge 0$
    $.4x_1 - .4x_2 - .1x_3 - .1x_4 + 20 \ge 0$
    [illegible] $- .01x^2$ [illegible] $+ 15 \ge 0$

             i = 1      i = 2      i = 3      i = 4

    a_i      10.0       12.0       10.5       11.0
    b_i      0.0634     0.0950     0.6740     0.7540
    c_i      8.0        10.0       8.5        9.0
    d_i      2.50       2.55       2.20       2.25
    k_i      0.12       0.13       0.045      0.050

                        Table 1  2/
It is further assumed that $\beta$ is normally distributed with zero
mean and unit variance. Hence $\Phi^{-1}(.05) = -1.64$.

1/  Each $x_i$ represents hundreds of units of product $i$. The last
three constraints are to be interpreted as constraints on three
resources, which we refer to as resources A, B, and C respectively.
Resources are measured in thousands of units.

2/  The units of the coefficients are such that $f_1$ and $f_2$ are in
thousands of dollars.

It is clear, since $k_i > 0$ ($i = 1, \ldots, 4$), that $f_1$, $f_2$
and $g_i$ ($i = 1, \ldots, 7$) are analytic on some open region containing the
non-negative orthant. Because the constraints are concave, therefore,
Condition 1 of Chapter III is satisfied. Since resources are limited
in all real problems of this type, Condition 2 is not restrictive,
and in fact holds for the feasible region of our numerical example.
It was observed above that the hessian of $f_1$ is negative definite
on the non-negative orthant, and the same is true for $f_2$, so that
Condition 3 holds. We shall not bother to verify whether Condition 4
is satisfied by our numerical example.
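
As an illustration of how the data of Table 1 enter the objective of (6), the following Python sketch evaluates $f_1$, $f_2$, and the weighted objective for a given production vector. It is only a sketch: the function and constant names are hypothetical, the resource constraints are not modelled, and $f_2$ is written in the single-branch form that applies when $\sum_i b_i x_i \ge 0$. The fractile is computed with the standard library rather than taken from the rounded value in the text.

```python
import math
from statistics import NormalDist

# Coefficients from Table 1 (products i = 1, ..., 4).
A = [10.0, 12.0, 10.5, 11.0]
B = [0.0634, 0.0950, 0.6740, 0.7540]
C = [8.0, 10.0, 8.5, 9.0]
D = [2.50, 2.55, 2.20, 2.25]
K = [0.12, 0.13, 0.045, 0.050]

# .05-fractile of the standard normal price index; the text rounds it to -1.64.
Z05 = NormalDist().inv_cdf(0.05)


def f1(x):
    """Expected profit for production vector x (mean of the price index is zero)."""
    return sum(
        (a - c - d) * xi + (d / k) * math.log(k * xi + 1.0)
        for a, c, d, k, xi in zip(A, C, D, K, x)
    )


def f2(x):
    """.05-fractile of profit, valid when sum(b_i * x_i) >= 0."""
    return f1(x) + Z05 * sum(b * xi for b, xi in zip(B, x))


def objective(x, alpha):
    """Objective of problem (6) for a fixed weight alpha in [0, 1]."""
    return (1.0 - alpha) * f1(x) + alpha * f2(x)
```

Evaluating `objective(x, alpha)` over a grid of `alpha` for feasible `x` gives a rough picture of the tradeoff between the two criteria; it does not replace the parametric algorithm of Chapter III.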

A version of the Basic Computational Algorithm for solving (6)
was coded for the Burroughs B5000 computer. No attempt was made to
optimize program efficiency beyond the incorporation of a simple
variable step size feature (see the last two paragraphs of section 3,
Chapter III). The results of the computation are presented in
Figures 1, 2, and 3. Figure 1 is a graph of the optimal production
schedule, $x^*(\alpha)$, as a function of $\alpha$. Note the markers at the
following values for $\alpha$, each of which is a point of change marking
an execution of Step 3: 0.6024, 0.7819, 0.8558. Since no false or
silent alarms are encountered at any of these points, Step 3 is
executed in each case with no erroneous trials. Figure 2 presents
graphs of $u_i^*(\alpha)$ and $g_i(x^*(\alpha))$ ($i = 5,6,7$). Note that the dual
variables (or "shadow prices") $u_i^*(\alpha)$ ($i = 1, \ldots, 4$) are not graphed,
since they are identically zero on [0,1], and that it is not necessary
to graph the non-negativity constraints. Figure 3 is a plot of the
efficient outcomes associated with the two criterion functions, that is,
a tradeoff curve. It shows, for example, that production plan x*(0.807)
guarantees a profit of at least $52,700 with probability .95 and an
expected profit of $79,100.
APPENDIX A


Some Properties of Convex Sets and Concave Functions

A set $S$ in $E^n$ is said to be convex if $(\lambda x' + (1-\lambda)x'') \in S$
whenever $x', x'' \in S$ and $0 \le \lambda \le 1$.

A function $f(x)$ which is defined on a convex set $S$ is said
to be concave if $f(\lambda x' + (1-\lambda)x'') \ge \lambda f(x') + (1-\lambda) f(x'')$
whenever $x', x'' \in S$ and $0 \le \lambda \le 1$; if the inequality holds
strictly whenever $x' \ne x''$ and $0 < \lambda < 1$, $f(x)$ is said to be
strictly concave. The function $-f(x)$ is said to be convex or strictly
convex according as $f(x)$ is concave or strictly concave. When the
convex set $S$ is not specified explicitly, it is implicitly taken
to be the entire space.

The following properties of convex sets and concave functions
are used in the text. The proofs, most of which follow easily from
the definitions, may be found in Fenchel (1955) or Zoutendijk (1960).

A.1  If $g_i(x)$ ($i = 1, \ldots, m$) are concave functions on $E^n$,
     then $\{x : g_i(x) \ge 0, i = 1, \ldots, m\}$ is a closed and convex
     set.

A.2  Any local maximum of a concave function on a convex set is
     also a global maximum over that set; a strictly concave
     function can have at most one local maximum.

A.3  A twice-differentiable function defined on a convex set $S$
     is concave if and only if its hessian matrix is negative
     semidefinite at each $x \in S$. If the hessian is negative
     definite at each $x \in S$, then the function is strictly
     concave (the converse is not true in general, but does
     hold when the function is a quadratic polynomial and $S = E^n$).

A.4  If $f_i(x)$ ($i = 1, \ldots, k$) are concave functions on a convex
     set $S$, and $u_i \ge 0$ ($i = 1, \ldots, k$), at least one $u_i > 0$,
     then $\sum_{i=1}^{k} u_i f_i(x)$ is concave on $S$; if $f_i(x)$ is strictly
     concave for some $i$ such that $u_i > 0$, then $\sum_{i=1}^{k} u_i f_i(x)$
     is strictly concave.

A.5  A concave or convex function on a convex set $S$ is continuous
     at every relative interior point of $S$.

A.6  If $f(x)$ is differentiable and concave on a convex set $S$
     and $x^\circ \in S$, $\nabla f(x^\circ) = 0$, then $f(x^\circ) \ge f(x)$ for all
     $x \in S$.

A.7  The Theorem of the Separating Hyperplane asserts that if
     $S$ and $T$ are two convex sets in $E^n$ with no interior
     point in common, then there exist an $n$-vector $v \ne 0$ and
     a scalar $c$ such that $\sum_i v_i s_i \le c \le \sum_i v_i t_i$ for all
     $s \in S$, $t \in T$ (see Karlin, 1959, p. 598 for a proof).


APPENDIX  B 


Graphical  Examples 

We shall illustrate the Basic Conceptual Algorithm by considering
three examples of the form

(B.1)  Maximize over $x$:  $\alpha \sum_{i=1}^{n} -(x_i - c_i^*)^2 + (1-\alpha) \sum_{i=1}^{n} -(x_i - c_i)^2$
       subject to  $a_i x + b_i \ge 0$,  $i = 1, \ldots, m$.

The first example is well-behaved in the sense that there are no
false or silent alarms (see section 4.1 for definitions of "false"
and "silent" alarms), whereas in the second and third examples such
troubles do occur.

Problems of the form (B.1) are among the simplest which can be
subsumed under the present theory: both objective functions are
quadratic and linearly separable, and the constraints are linear.
The fact that false and silent alarms can occur for such problems
seems to render unlikely the existence of a special class of ($P\alpha$)
for which false and silent alarms cannot occur.

The examples to be given are presented and analyzed graphically
rather than numerically because (B.1) is readily amenable to graphical
interpretation when $n = 2$ (the case considered here). Let $\alpha$ be
fixed. When $S$ is a consistent set, i.e., when
$X_S = \{x : a_i x + b_i = 0, i \in S\} \ne \emptyset$, it follows from the
Kuhn-Tucker Theorem that $(=S)_\alpha$ is necessary and sufficient for a
maximum of the objective function subject to $x \in X_S$. From the
circularity of the level curves of this particular objective function
it is evident that this constrained maximum is just the point of $X_S$
nearest to the unconstrained maximum $x(\alpha) = \alpha c^* + (1-\alpha)c$.
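
The nearest-point characterization makes the single-constraint case of $(=S)_\alpha$ easy to compute by hand or by machine. The sketch below, in Python, is an illustration under stated assumptions rather than part of the original development: the function names are hypothetical, only one linear constraint $a x + b = 0$ is treated, and the multiplier is defined by the convention that minus the gradient of the objective equals $u$ times the gradient of the constraint.

```python
from typing import Sequence, Tuple


def unconstrained_max(alpha: float, c_star: Sequence[float], c: Sequence[float]) -> Tuple[float, ...]:
    """x(alpha) = alpha * c_star + (1 - alpha) * c for problem (B.1)."""
    return tuple(alpha * cs + (1.0 - alpha) * ci for cs, ci in zip(c_star, c))


def solve_single_constraint(
    x_bar: Sequence[float], a: Sequence[float], b: float
) -> Tuple[Tuple[float, ...], float]:
    """Nearest point to x_bar on the line a.x + b = 0, and its multiplier u.

    Up to a constant the objective of (B.1) is -|x - x_bar|^2, so
    -grad f = 2 (x - x_bar), and u is defined by -grad f = u * a.
    """
    aa = sum(ai * ai for ai in a)
    g_at_x_bar = sum(ai * xi for ai, xi in zip(a, x_bar)) + b
    x_s = tuple(xi - (g_at_x_bar / aa) * ai for xi, ai in zip(x_bar, a))
    u = -2.0 * g_at_x_bar / aa
    return x_s, u
```

Under this convention `u` is positive exactly when the unconstrained maximum violates the constraint, and negative when the unconstrained maximum is strictly feasible, which is the situation in which an optimality alarm would be given for that constraint.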

Each figure is drawn in $x$-space ($n = 2$) with two constraints
($m = 2$). The loci of $g_1(x) = 0$, $g_2(x) = 0$, the unconstrained
maximum $x(\alpha)$, and the constrained maximum $x^*(\alpha)$ (the heavy line)
are drawn, as well as certain features pertaining to the points of
change. Light lines representing the projection of $x(\alpha)$ onto the
feasible region are also drawn; in view of the circularity of the
level curves of the objective function for fixed $\alpha$, these lines are
in the direction of the gradient of the objective function at $x^*(\alpha)$.
The gradients of the constraints point into the feasible region.
From $(=S)_\alpha$ we see that the dual variables express minus the
gradient of the objective function at $x^S(\alpha)$ as a linear combination
of the gradients of the constraints in $S$. The signs of $u_i^S(\alpha)$
($i \in S$) are easily determined by visual inspection of the figures.

The first example is presented graphically in Figure B.1. At
$\alpha = 0$ the unconstrained maximum $x(0)$ is interior to the feasible
region. Thus the constrained maximum $x^*(0)$ equals $x(0)$ and
$B_0 = \emptyset$, which implies that $A_0 = \emptyset$ since
$A_\alpha \subset B_\alpha$ for all $\alpha$. We are obliged to let
$S^\circ = \emptyset$, for the empty set is the only valid set at
$\alpha = 0$ (recall that $S$ is valid at $\alpha$ if and only if
$A_\alpha \subset S \subset B_\alpha$). Step 1 is complete. Step 2 demands
that we solve $(=\emptyset)_\alpha$ as $\alpha$ increases above 0 until an
alarm is given, i.e., until $x^\emptyset(\alpha)$ leaves the feasible region
or $u_i^\emptyset(\alpha)$ becomes negative for some $i$. The last alternative
(an optimality alarm) cannot happen for $S^\circ = \emptyset$, for
$(=\emptyset)_\alpha$ requires $u^\emptyset(\alpha) = 0$. Only the first
alternative (a feasibility alarm) can occur. Equations
$(=\emptyset)_\alpha$ are easily seen to be the conditions for an
unconstrained maximum. Since $x(\alpha)$ is interior to the feasible region
for $0 \le \alpha < \alpha_1$, no alarms are given on $[0,\alpha_1)$;
$(x^\emptyset(\alpha), u^\emptyset(\alpha)) = (x^*(\alpha), u^*(\alpha)) = (x(\alpha), 0)$
and $A_\alpha = B_\alpha = \emptyset$ on $[0,\alpha_1)$. At $\alpha_1$ the
unconstrained maximum happens to be on the boundary of the feasible
region, but beyond $\alpha_1$ it violates the first constraint, i.e.
$(=\emptyset)_\alpha$ leads to a feasibility alarm for $g_1$ just above
$\alpha_1$. Thus $\alpha_1$ is the point of change which completes Step 2,
and $(x^\emptyset(\alpha_1), u^\emptyset(\alpha_1)) = (x^*(\alpha_1), u^*(\alpha_1)) = (x(\alpha_1), 0)$,
$A_{\alpha_1} = \emptyset$, $B_{\alpha_1} = \{1\}$. Since $\alpha_1 < 1$ we go
to Step 3. Two sets are valid at $\alpha_1$: $\emptyset$ and $\{1\}$. The
former was seen at Step 2 not to be valid above $\alpha_1$, and so the
latter must be. Control is now returned to Step 2 with $S^\circ = \{1\}$.

To execute Step 2 for the second time we must solve $(=\{1\})_\alpha$ as
$\alpha$ increases above $\alpha_1$ until an alarm obtains. These equations are
the conditions for a maximum of the objective function subject to the
first constraint being exactly satisfied. As $\alpha$ increases above
$\alpha_1$, $x^1(\alpha)$ moves along the portion of the boundary determined by
the first constraint; since minus the gradient of the objective
function at $x^1(\alpha)$ is expressed as $u_1^1(\alpha)$ times the gradient of
$g_1$, it is geometrically clear that $u_1^1(\alpha)$ grows increasingly
positive as $\alpha$ increases. Hence no alarms are given until $\alpha_2$ is
passed, when the second constraint begins to be violated. We have
$x^1(\alpha) = x^*(\alpha)$, $u_1^1(\alpha) = u_1^*(\alpha) > 0$,
$u_2^1(\alpha) = u_2^*(\alpha) = 0$, $A_\alpha = B_\alpha = \{1\}$
on $(\alpha_1, \alpha_2)$. Since $\alpha_2 < 1$ is the point of change at which
Step 2 is completed, we go to Step 3. Now $A_{\alpha_2} = \{1\}$,
$B_{\alpha_2} = \{1,2\}$, so that $\{1\}$ and $\{1,2\}$ are valid at
$\alpha_2$; since the former was seen not to be valid just above $\alpha_2$,
the latter must be. Control is returned to Step 2 again, this time with
$S^\circ = \{1,2\}$.
Step 2 now requires that $(=\{1,2\})_\alpha$ be solved as $\alpha$ increases
above $\alpha_2$ until an alarm occurs. These equations are the conditions
for a maximum of the objective function subject to both constraints
being satisfied exactly. Since the intersection of the two equality
constraints determines a unique point, $x^{1,2}(\alpha)$ is constant for all
$\alpha$. The projection lines of $x(\alpha)$ onto the feasible region and the
interpretation of the dual variables make it clear that
$u^{1,2}(\alpha) > 0$ on $(\alpha_2, \alpha_3)$, $u_1^{1,2}(\alpha_3) = 0$,
$u_2^{1,2}(\alpha_3) > 0$, and $u_1^{1,2}(\alpha) < 0$, $u_2^{1,2}(\alpha) > 0$
for $\alpha > \alpha_3$. In other words, an optimality alarm occurs
for the first constraint just above $\alpha_3$, so that Step 2 is complete
at that point of change. Going to Step 3, we see that
$A_{\alpha_3} = \{2\}$, $B_{\alpha_3} = \{1,2\}$; since the latter is not
valid just above $\alpha_3$, the former must be. Control is returned to
Step 2 with $S^\circ = \{2\}$.

At Step 2, $(=\{2\})_\alpha$ must be solved as $\alpha$ increases above
$\alpha_3$. Reasoning as before, we see that $\{2\}$ remains valid on
$[\alpha_3, 1]$. Hence $x^2(\alpha) = x^*(\alpha)$, $A_\alpha = B_\alpha = \{2\}$,
$u_1^2(\alpha) = 0$, and $u_2^2(\alpha) > 0$ on $(\alpha_3, 1]$.

This completes the solution of the first example. A summary
appears in Table B.1. Note that there were no false or silent alarms,
and no erroneous trials at any Step 3.
The second and third examples are presented graphically in
Figures B.2 and B.3. The summaries which appear in the corresponding
Tables B.2 and B.3 can be constructed by following the lines of
reasoning illustrated in the above discussion of the first example.
Nevertheless, certain of the entries are reasoned out below. The
second example is designed to show that false feasibility and silent
optimality alarms can occur, the third to show that false optimality
and silent feasibility alarms can occur.

The second example is very much like the first, except that the
unconstrained maximum happens to pass through the vertex of the feasible
region. At $\alpha = \alpha_1$: $x^*(\alpha_1) = x(\alpha_1)$,
$A_{\alpha_1} = \emptyset$, and $B_{\alpha_1} = \{1,2\}$. At Step 3 one must
solve $(=S)_\alpha$ for $\alpha$ just above $\alpha_1$, $S$ valid at $\alpha_1$,
until a set which is valid just above $\alpha_1$ is found. The four
sets $\emptyset$, $\{1\}$, $\{2\}$, and $\{1,2\}$ are valid at $\alpha_1$. If
one tries $\emptyset$, it is clear that $x^\emptyset(\alpha) = x(\alpha)$ violates
both constraints as $\alpha$ increases above $\alpha_1$, and also that only
$\{2\}$ is valid just above $\alpha_1$. Hence there is a false feasibility
alarm for $g_1$, for $g_1$ is not in the deficiency of $\emptyset$ and is
not degenerate just above $\alpha_1$. See the second line of Table B.2.
If one tries $\{1\}$, $(=\{1\})_\alpha$ are the conditions for a maximum of
the objective function subject to the first constraint being exactly
satisfied. It is evident that $x^1(\alpha)$ violates the second constraint
above $\alpha_1$, i.e. a feasibility alarm for $g_2$ obtains. Since minus
the gradient of the objective function at $x^1(\alpha)$ is expressed as
$u_1^1(\alpha)$ times the gradient of $g_1$, $u_1^1(\alpha)$ is seen to be
positive above $\alpha_1$. Thus no optimality alarm obtains for $g_1$,
which means, in view of the unique validity of $\{2\}$ above $\alpha_1$ and
the fact that $g_1$ is in the excess of $\{1\}$ just above $\alpha_1$, that
$(=\{1\})_\alpha$ leads to a silent optimality alarm. See the third line
of Table B.2.

In the third and last example, the unconstrained maximum again
happens to pass through the vertex of the feasible region. At
$\alpha = \alpha_1$ we have $x^*(\alpha_1) = x(\alpha_1)$, $A_{\alpha_1} = \emptyset$,
and $B_{\alpha_1} = \{1,2\}$. The valid sets at $\alpha_1$ are $\emptyset$,
$\{1\}$, $\{2\}$, and $\{1,2\}$. The only set which is valid just above
$\alpha_1$ is $\{2\}$. If one tries $\{1\}$ at Step 3, $x^1(\alpha)$ evidently
remains feasible. Since $g_2$ is in the deficiency of $\{1\}$ just above
$\alpha_1$, we see that $(=\{1\})_\alpha$ leads to a silent feasibility
alarm, as recorded in the third line of Table B.3. If one tries
$\{1,2\}$, $x^{1,2}(\alpha)$ must remain at the intersection of the two
equality constraints. It is graphically clear that minus the gradient
of the objective function at $x^{1,2}(\alpha) = x^*(\alpha_1)$ is represented
by a negative linear combination of the gradients of the constraints
as $\alpha$ increases above $\alpha_1$, so that optimality alarms occur for
both constraints. Since $g_2$ is not in the excess of $\{1,2\}$ and is not
degenerate just above $\alpha_1$, a false optimality alarm registers for
the second constraint. See the fifth line of Table B.3.
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Valid  Sets 
at  a:  S 


Feasibility  and  Opti¬ 
mality  Alarms  Due  to 
S  Just  Above  a 


Deficiency  and 
Excess  of  S 
Just  Above  a 


Feasibility 


Optimality 


Deficiency 


Excess 


{1}
None
{1}
None
{1}
None
None
None
None
{1}
{1}
{1,2}
{1,2}
{1,2}
{2}
{2}
{2}
None
None
None
None
None
{1}
None
{2} | None
None | None
None
None
{1}
None


Table B.1


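[Figure B.2: the constraint curves $g_1(x) = 0$ and $g_2(x) = 0$, the gradient $\nabla_x g_2(x^*(\alpha_1))$, and the points $x^*(0)$ and $x^*(\alpha_1)$, with trajectory segments labelled $[0, \alpha_1)$ and $(\alpha_1, 1]$.]

Figure B.2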


Valid Sets       | Feasibility and Optimality Alarms | Deficiency and Excess of S
at $\alpha$: S   | Due to S Just Above $\alpha$      | Just Above $\alpha$
                 | Feasibility | Optimality          | Deficiency | Excess
-----------------+-------------+---------------------+------------+--------
$\emptyset$      | {1,2} 1/    | None                | {2}        | None
{1}              | {2}         | None 2/             | {2}        | {1}
{2}              | None        | None                | None       | None
{1,2}            | None        | {1}                 | None       | {1}

Table B.2

1/ False feasibility alarm for $g_1$.
2/ Silent optimality alarm: no optimality alarm for $u_1$.


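[Figure B.3: the constraint curves $g_1(x) = 0$ and $g_2(x) = 0$, the gradients $\nabla_x g_1(x^*(\alpha_1))$ and $\nabla_x g_2(x^*(\alpha_1))$, the point $x^*(\alpha_1)$, and the trajectory segment labelled $[0, \alpha_1)$.]

Figure B.3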


Valid Sets       | Feasibility and Optimality Alarms | Deficiency and Excess of S
at $\alpha$: S   | Due to S Just Above $\alpha$      | Just Above $\alpha$
                 | Feasibility | Optimality          | Deficiency | Excess
-----------------+-------------+---------------------+------------+--------
$\emptyset$      | {2}         | None                | {2}        | None
{1}              | None 1/     | {1}                 | {2}        | {1}
{2}              | None        | None                | None       | None
{1,2}            | None        | {1,2} 2/            | None       | {1}

Table B.3

1/ Silent feasibility alarm; no feasibility alarm for $g_2$.
2/ False optimality alarm for $u_2$.
APPENDIX C

Computational Devices

1. Newton's Method

Newton's  method  is  based  on  using  successive  linear  approximations 
for  solving  systems  of  nonlinear  equations  when  good  first  approxi¬ 
mations  of  the  solutions  are  available.  In  order  to  solve  the  system 

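$$
\begin{aligned}
f_1(x) &= 0 \\
&\;\;\vdots \\
f_n(x) &= 0 ,
\end{aligned}
\tag{C.1}
$$

where x is an n-vector, Newton's method is the recursion

$$
x^{k+1} = x^k - \left[ \frac{\partial f(x^k)}{\partial x} \right]^{-1} f(x^k),
\qquad k = 0, 1, 2, \ldots ,
\tag{C.2}
$$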


where $x^0$ is a given starting point. The stated inverse must
exist in order for (C.2) to be well-defined, of course. We denote
the right-hand side of (C.2) by $F(x^k)$. $F(x)$ is the iteration
function obtained by applying Newton's method to (C.1).
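
The recursion (C.2) can be illustrated by a minimal Python sketch, which is not part of the original text. The helper names `solve_linear` and `newton`, the tolerance, and the two-equation test system are assumptions chosen for the example. Instead of forming the inverse Jacobian explicitly, each step solves the equivalent linear system for the Newton correction.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("Jacobian is singular; (C.2) is not well-defined")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def newton(f, jacobian, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - J(x)^{-1} f(x) until max |f_i(x)| < tol."""
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        step = solve_linear(jacobian(x), fx)   # step = J(x)^{-1} f(x)
        x = [xi - si for xi, si in zip(x, step)]
    raise RuntimeError("Newton's method did not converge")


# Hypothetical test system: x1^2 + x2^2 = 4 and x1 = x2.
def f(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]


def jac(x):
    return [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]


root = newton(f, jac, [1.0, 1.0])
```

Starting from (1, 1), the iterates approach the root at which both components equal the square root of 2.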

There  are  numerous  versions  of  conditions  under  which  Newton's 
method  can  be  guaranteed  to  converge.  The  following  theorem  is 
typical. 

Theorem C.1:

Assume that $f_i(x)$ $(i = 1, \ldots, n)$ is continuously differentiable
on some neighborhood of $x^*$, and that the Jacobian does not
vanish at $x^*$, where $f(x^*) = 0$. Then Newton's method is well-defined
and quadratically convergent to $x^*$ if the starting point $x^0$ is
in a sufficiently small neighborhood of $x^*$.

See  Householder  (1955,  p.  156)  for  a  proof. 

Quadratic convergence of the sequence $\langle x^k \rangle$ $(k = 0, 1, \ldots)$
to $x^*$ means that (here $\|\cdot\|$ denotes the Euclidean norm)

$$
\lim_{k \to \infty} \frac{\|x^k - x^*\|}{\|x^{k-1} - x^*\|^2} = \text{a constant} \neq 0 .
$$

By way of contrast, linear convergence would mean that

$$
\lim_{k \to \infty} \frac{\|x^k - x^*\|}{\|x^{k-1} - x^*\|} = \text{a constant} \neq 0 .
$$
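
The difference between the two orders of convergence can be seen numerically. The following Python sketch is an illustration added here, not taken from the original; the scalar test equation x^2 - 2 = 0, the starting point 2, and the function name `newton_errors` are assumptions. For this equation the error obeys e_k = e_{k-1}^2 / (2 x^{k-1}) exactly, so the quadratic ratio tends to 1/(2 sqrt(2)), roughly 0.354, while the linear ratio tends to zero.

```python
import math


def newton_errors(x0, steps):
    """Errors |x^k - x*| for Newton's method on f(x) = x^2 - 2, with x* = sqrt(2)."""
    target = math.sqrt(2.0)
    x = x0
    errors = [abs(x - target)]
    for _ in range(steps):
        x = x - (x * x - 2.0) / (2.0 * x)   # scalar case of recursion (C.2)
        errors.append(abs(x - target))
    return errors


errors = newton_errors(2.0, 4)

# Ratio appearing in the definition of quadratic convergence.
quadratic_ratios = [errors[k] / errors[k - 1] ** 2 for k in range(1, len(errors))]

# Ratio appearing in the definition of linear convergence.
linear_ratios = [errors[k] / errors[k - 1] for k in range(1, len(errors))]
```

Only four steps are taken because the error soon reaches the level of floating-point rounding, after which the ratios are no longer meaningful.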

Evidently the quadratic convergence of Newton's method is a highly
desirable feature. The price one pays for it is the necessity of
evaluating an inverse matrix at each iteration and of having a good
starting point. To ameliorate the first disadvantage, at some
expense of speed of convergence, approximate inverses can be used.
Often one can achieve a substantial net gain in computational efficiency
by judicious application of this idea (see, for example, Ostrowski,
1960, and Householder, 1955, p. 156).

For the purpose of proving Theorems 4.1 and 4.2, we find it more
convenient to employ

Theorem C.2:

Let $x^*$ satisfy $f(x^*) = 0$. Assume that there exists a neighborhood
$N(x^*)$ on which the following three assertions hold:

(a) The functions $f_i(x)$ $(i = 1, \ldots, n)$ are twice continuously
    differentiable.

(b) The Jacobian $\left| \partial f(x) / \partial x \right| \neq 0$.

(c) $\left[ \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \partial F_i(x) / \partial x_j \right)^2 \right]^{1/2} \le L < 1$.

Then Newton's method (C.2) is well-defined and quadratically
convergent to $x^*$ if the starting point $x^0$ is in $N(x^*)$.

This theorem follows from results given in Householder (1955,
p. 155) and Henrici (1964, p. 101).

Remark: The square-root expression in (c) is an upper estimate of
the Euclidean norm of the Jacobian matrix of $F(x)$ (see
Faddeeva, 1959, p. 121).


For reference we record the recursion equation of Newton's method
applied to $(=S)\alpha_0$. We have, for $k = 0, 1, 2, \ldots$,

$$
\begin{bmatrix} x^{k+1} \\ u_S^{k+1} \end{bmatrix}
=
\begin{bmatrix} x^{k} \\ u_S^{k} \end{bmatrix}
-
\begin{bmatrix} H & D^t \\ D & 0 \end{bmatrix}^{-1}
\begin{bmatrix} \nabla_x \{ f(x;\alpha) + \sum_{i \in S} u_i g_i(x) \} \\ g_S \end{bmatrix},
\tag{C.3}
$$

where $H = \nabla_x^2 \{ f(x;\alpha) + \sum_{i \in S} u_i g_i(x) \}$, $D$ is the
matrix whose rows are $\nabla_x g_i(x)$ $(i \in S)$, and $u_S$ and $g_S$ are
the vectors obtained by deleting from $u$ and $g(x)$ the components not
in $S$; all quantities on the right-hand side are evaluated at
$\alpha_0$ and $(x^k, u_S^k)$.

Note that the equations $u_i = 0$ $(i \notin S)$, which are a part of
$(=S)\alpha_0$, have been omitted from the recursion because they are
already solved.

In order to have a compact notation for the square-root expression
in (c) of Theorem C.2 specialized to $(=S)\alpha_0$, we denote by
$A(x, u_S; \alpha_0, S)$ the square root of the sum of the squares of all
the elements of the Jacobian matrix of the iteration function appearing
in (C.3) (i.e., of the Jacobian matrix of the right-hand side of (C.3)
considered as a vector-valued function of $(x, u_S)$).
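
A Newton recursion on a stationarity-plus-equality system of this kind can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's procedure: the toy problem (maximize x1 + x2 with the single constraint g(x) = 1 - x1^2 - x2^2 held as an equality, so that S = {1}), the starting point, the fixed iteration count, and the names `solve_linear` and `kkt_newton` are all hypothetical, and the parameter alpha is held fixed and therefore does not appear.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("bordered matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def kkt_newton(x, u, iterations=20):
    """Newton's method on grad_x{f + u g} = 0, g = 0 for the toy problem
    f(x) = x1 + x2, g(x) = 1 - x1^2 - x2^2 (an assumed example)."""
    for _ in range(iterations):
        # Residual: gradient of f + u*g with respect to x, then g itself.
        grad_l = [1.0 - 2.0 * u * x[0], 1.0 - 2.0 * u * x[1]]
        g = 1.0 - x[0] ** 2 - x[1] ** 2
        # H is the Hessian of f + u*g; d is the single row of D (gradient of g).
        h = [[-2.0 * u, 0.0], [0.0, -2.0 * u]]
        d = [-2.0 * x[0], -2.0 * x[1]]
        # Bordered matrix [[H, D^t], [D, 0]].
        matrix = [h[0] + [d[0]], h[1] + [d[1]], d + [0.0]]
        step = solve_linear(matrix, grad_l + [g])
        x = [x[0] - step[0], x[1] - step[1]]
        u = u - step[2]
    return x, u


x_opt, u_opt = kkt_newton([0.8, 0.6], 0.7)
```

For this toy problem the iterates approach x1 = x2 = u = 1/sqrt(2). A linear solve is used in place of the explicit inverse; the two are equivalent for a single step.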


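2. Partitions of the Inverse Matrix Required by Newton's Method

Let

$$
\begin{bmatrix} H & D^t \\ D & 0 \end{bmatrix}
$$

be defined as in (C.3). Under our conditions, it is easily verified
that

$$
\begin{bmatrix} H & D^t \\ D & 0 \end{bmatrix}^{-1}
=
\begin{bmatrix}
H^{-1} - H^{-1} D^t (D H^{-1} D^t)^{-1} D H^{-1} & H^{-1} D^t (D H^{-1} D^t)^{-1} \\
(D H^{-1} D^t)^{-1} D H^{-1} & -(D H^{-1} D^t)^{-1}
\end{bmatrix} .
\tag{C.4}
$$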

Let there be s elements in S (by Condition 4, $s \le n$). The
inversion of the n+s by n+s matrix has been reduced to the inversion
of two matrices, one n by n ($H$) and the other s by s ($D H^{-1} D^t$),
and to several matrix multiplications.
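
The partitioned form of the inverse can be checked numerically. The Python sketch below is an added illustration, not part of the original; the helper names and the particular 2 by 2 matrix H and 1 by 2 matrix D are assumptions. It assembles the four blocks from the inverses of H and of D H^{-1} D^t and compares the result with a direct inversion of the full bordered matrix.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def partitioned_inverse(h, d):
    """Inverse of [[H, D^t], [D, 0]] assembled block by block."""
    n, s = len(h), len(d)
    h_inv = inverse(h)
    h_inv_dt = matmul(h_inv, transpose(d))         # H^-1 D^t
    schur_inv = inverse(matmul(d, h_inv_dt))       # (D H^-1 D^t)^-1
    upper_right = matmul(h_inv_dt, schur_inv)
    lower_left = matmul(schur_inv, matmul(d, h_inv))
    correction = matmul(upper_right, matmul(d, h_inv))
    upper_left = [[h_inv[i][j] - correction[i][j] for j in range(n)] for i in range(n)]
    lower_right = [[-v for v in row] for row in schur_inv]
    top = [upper_left[i] + upper_right[i] for i in range(n)]
    bottom = [lower_left[i] + lower_right[i] for i in range(s)]
    return top + bottom


# Assumed example data: n = 2, s = 1.
h = [[-2.0, 0.5], [0.5, -3.0]]
d = [[1.0, 2.0]]
full = [h[0] + [d[0][0]], h[1] + [d[0][1]], d[0] + [0.0]]

direct = inverse(full)
blockwise = partitioned_inverse(h, d)
max_diff = max(abs(direct[i][j] - blockwise[i][j]) for i in range(3) for j in range(3))
```

Only an n by n and an s by s inversion are performed inside `partitioned_inverse`; the direct inversion of the full matrix is included solely as a check.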


Whereas the size of $H$ remains constant no matter what S is,
the dimension of $D H^{-1} D^t$ does vary with S, for during Step 3 rows
are added to and deleted from D as S changes. It is advantageous
to use bordering methods to pass from an available $(D H^{-1} D^t)^{-1}$ to
the next when S is changed at Step 3. We shall consider the case
in which one row is added to "the bottom" of D, and also the case
in which the last row of D is deleted. Results similar to the
following can be derived to cover the addition or deletion of an
arbitrary row, and also multiple additions and/or deletions.

If  one  row  d  is  to  be  added  to  D,  then 


.  4-  r dh"^d“  j  m'^d*  "I 




where  is  1  by  1  (i.e.,  is  a  scalar).  If  row  d  is  deleted 

from  D,  then  =  ^1  "  ^2  ^^^3' 
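
The idea of bordering can be sketched with the standard identity for the inverse of a symmetric matrix extended by one row and column. The Python code below is an added illustration and does not claim to reproduce the exact formulas or symbols of the text; the function names `border_add` and `border_delete`, the symbols in the comments, and the numerical data are assumptions. Here B stands for the current s by s matrix, `col` for the new border column, and `corner` for the new diagonal entry.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def border_add(b_inv, col, corner):
    """Inverse of [[B, col], [col^t, corner]] computed from B^-1 (B symmetric)."""
    s = len(b_inv)
    w = [sum(b_inv[i][k] * col[k] for k in range(s)) for i in range(s)]   # B^-1 col
    gamma = corner - sum(col[i] * w[i] for i in range(s))                 # scalar Schur complement
    if abs(gamma) < 1e-14:
        raise ValueError("bordered matrix is singular")
    top = [[b_inv[i][j] + w[i] * w[j] / gamma for j in range(s)] + [-w[i] / gamma]
           for i in range(s)]
    bottom = [-w[j] / gamma for j in range(s)] + [1.0 / gamma]
    return top + [bottom]


def border_delete(m_inv):
    """Inverse of the leading block after the last row and column are dropped.

    With the available inverse partitioned as [[A1, A2], [A3, A4]] and A4 a
    scalar, the result is A1 - A2 A4^-1 A3.
    """
    s = len(m_inv) - 1
    a4 = m_inv[s][s]
    return [[m_inv[i][j] - m_inv[i][s] * m_inv[s][j] / a4 for j in range(s)]
            for i in range(s)]


# Assumed example: B = [[2, 1], [1, 3]], whose inverse is known in closed form.
b_inv = [[0.6, -0.2], [-0.2, 0.4]]
bordered = [[2.0, 1.0, 1.0], [1.0, 3.0, 0.5], [1.0, 0.5, 4.0]]

bordered_inv = border_add(b_inv, [1.0, 0.5], 4.0)
restored = border_delete(bordered_inv)
```

Neither routine performs a matrix inversion; each costs on the order of s^2 operations, which is the point of passing from one inverse to the next in this way.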


3. A Refinement Method for Approximate Matrix Inverses

Suppose that A is a square matrix whose inverse exists and is
desired to be found, and that an approximate inverse $B_0$ is available.
The error inherent in $B_0$ causes the matrix $I - AB_0$ not to vanish.
If $\|I - AB_0\| \le L < 1$, then the recursion

$$
B_k = B_{k-1} + B_{k-1} (I - A B_{k-1}), \qquad k = 1, 2, \ldots
$$

converges to $A^{-1}$, and the considerable rapidity of the convergence
is apparent from the estimate

$$
\|B_k - A^{-1}\| \le \|B_0\| \cdot \frac{L^{2^k}}{1 - L} .
$$

See Faddeeva (1959, pp. 99-102) for further details on this
method, which is due to H. Hotelling.
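
A minimal Python sketch of this refinement recursion follows. It is an added illustration; the function names, the 2 by 2 test matrix, and the rough starting inverse are assumptions. The norm used is the maximum absolute column sum.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def norm1(a):
    """Maximum absolute column sum of a matrix."""
    return max(sum(abs(a[i][j]) for i in range(len(a))) for j in range(len(a[0])))


def residual(a, b):
    """The matrix I - A B, which vanishes when B is the exact inverse of A."""
    ab = matmul(a, b)
    n = len(a)
    return [[(1.0 if i == j else 0.0) - ab[i][j] for j in range(n)] for i in range(n)]


def refine_inverse(a, b0, steps):
    """Iterate B_k = B_{k-1} + B_{k-1} (I - A B_{k-1}) for a fixed number of steps."""
    b = [list(row) for row in b0]
    n = len(b)
    for _ in range(steps):
        corr = matmul(b, residual(a, b))
        b = [[b[i][j] + corr[i][j] for j in range(n)] for i in range(n)]
    return b


# Assumed example: the exact inverse of a is [[0.3, -0.1], [-0.2, 0.4]].
a = [[4.0, 1.0], [2.0, 3.0]]
b0 = [[0.25, -0.05], [-0.15, 0.35]]     # deliberately inaccurate starting inverse

l0 = norm1(residual(a, b0))             # plays the role of L; must be below 1
b = refine_inverse(a, b0, 6)
```

Each step squares the residual I - A B, so the six steps used here reduce it from about 0.2 to the level of rounding error.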

It is clear that this device can be used to great advantage
in maintaining an arbitrarily accurate approximation to $H^{-1}$ as
$\alpha$ increases (for the elements of $H$, and therefore of $H^{-1}$, are
continuous functions of $\alpha$ on the unit interval), and also to
$(D H^{-1} D^t)^{-1}$ so long as S stays the same.


We define the norm $\|A\|$ of any n by n matrix A as
$\max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|$. Other norms could be used,
but this one (the so-called "p = 1 norm") is particularly convenient
for computational purposes.
4. Formulae for $d(x^S(\alpha), u_S(\alpha))/d\alpha$

It may be shown by implicitly differentiating $(=S)\alpha$ that the
following additional conclusions can be added to Theorem 2: for
a  €  la^, 

d(x®(a))/da  =  - 

d(ug(a))/da  =  -Vx^2^-^^  ' 

where R and Q are as in section 2 above and all quantities are
evaluated at $(x^S(\alpha), u_S(\alpha))$.

These formulae are of possible interest for the purpose of
facilitating the convergence of Newton's method, when fairly large
step sizes are being used, by extrapolating to better starting points.
Note that R and Q are immediately available from (C.4).
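
The implicit differentiation can be sketched as follows. This is an editorial reconstruction under stated assumptions, not a transcription of the original formulae: it assumes that $\alpha$ enters only through the objective $f(x;\alpha)$, that the constraints $g_i$ do not depend on $\alpha$, and that R and Q denote the upper-left and lower-left blocks of the inverse in (C.4). Differentiating the stationarity condition and the equality constraints along $(x^S(\alpha), u_S(\alpha))$ gives a linear system with the same bordered matrix as Newton's method.

```latex
\begin{aligned}
H \, \frac{d x^S}{d\alpha} + D^t \, \frac{d u_S}{d\alpha}
  + \nabla_x \frac{\partial f}{\partial \alpha} &= 0, \\
D \, \frac{d x^S}{d\alpha} &= 0,
\end{aligned}
\qquad \Longrightarrow \qquad
\begin{bmatrix} d x^S / d\alpha \\ d u_S / d\alpha \end{bmatrix}
= - \begin{bmatrix} H & D^t \\ D & 0 \end{bmatrix}^{-1}
\begin{bmatrix} \nabla_x \, \partial f / \partial \alpha \\ 0 \end{bmatrix}
= - \begin{bmatrix} R \\ Q \end{bmatrix} \nabla_x \frac{\partial f}{\partial \alpha} .
```

Under these assumptions the derivatives need only the first block column of the inverse, which is why no matrix work beyond that already done for the Newton step would be required.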